FUNDAMENTAL METHODS OF
MATHEMATICAL ECONOMICS





ALPHA C. CHIANG 





KEVIN WAINWRIGHT 


Preface 





This book is written for those students of economics intent on learning the basic
mathematical methods that have become indispensable for a proper understanding of the
current economic literature. Unfortunately, studying mathematics is, for many, something
akin to taking bitter-tasting medicine: absolutely necessary, but extremely unpleasant.
Such an attitude, referred to as “math anxiety,” has its roots, we believe, largely in the
inauspicious manner in which mathematics is often presented to students. In the belief
that conciseness means elegance, explanations offered are frequently too brief for clarity,
thus puzzling students and giving them an undeserved sense of intellectual inadequacy. An
overly formal style of presentation, when not accompanied by any intuitive illustrations or
demonstrations of “relevance,” can impair motivation. An uneven progression in the level
of material can make certain mathematical topics appear more difficult than they actually
are. Finally, exercise problems that are excessively sophisticated may tend to shatter
students’ confidence, rather than stimulate thinking as intended.

With that in mind, we have made a serious effort to minimize anxiety-causing features.
To the extent possible, patient rather than cryptic explanations are offered. The style is
deliberately informal and “reader-friendly.” As a matter of routine, we try to anticipate
and answer questions that are likely to arise in the students’ minds as they read. To
underscore the relevance of mathematics to economics, we let the analytical needs of
economists motivate the study of the related mathematical techniques and then illustrate
the latter with appropriate economic models immediately afterward. Also, the mathematical
tool kit is built up on a carefully graduated schedule, with the elementary tools serving
as stepping stones to the more advanced tools discussed later. Wherever appropriate,
graphic illustrations give visual reinforcement to the algebraic results. And we have
designed the exercise problems as drills to help solidify grasp and bolster confidence,
rather than exact challenges that might unwittingly frustrate and intimidate the novice.

In this book, the following major types of economic analysis are covered: statics
(equilibrium analysis), comparative statics, optimization problems (as a special type of
statics), dynamics, and dynamic optimization. To tackle these, the following mathematical
methods are introduced in due course: matrix algebra, differential and integral calculus,
differential equations, difference equations, and optimal control theory. Because of the
substantial number of illustrative economic models, both macro and micro, appearing here,
this book should be useful also to those who are already mathematically trained but still
in need of a guide to usher them from the realm of mathematics to the land of economics.
For the same reason, the book should not only serve as a text for a course on mathematical
methods, but also as supplementary reading in such courses as microeconomic theory,
macroeconomic theory, and economic growth and development.

We have attempted to retain the principal objectives and style of the previous editions.
However, the present edition contains several significant changes. The material on
mathematical programming is now presented earlier in a new Chap. 13 entitled “Further
Topics in Optimization.” This chapter has two major themes: optimization with inequality
constraints and the envelope theorem. Under the first theme, the Kuhn-Tucker conditions are
developed in much the same manner as in the previous edition. However, the topic has been
enhanced with several new economic applications, including peak-load pricing and consumer
rationing. The second theme is related to the development of the envelope theorem, the
maximum-value function, and the notion of duality. By applying the envelope theorem to
various economic models, we derive important results such as Roy’s identity, Shephard’s
lemma, and Hotelling’s lemma.

The second major addition to this edition is a new Chap. 20 on optimal control theory.
The purpose of this chapter is to introduce the reader to the basics of optimal control and
demonstrate how it may be applied in economics, including examples from natural resource
economics and optimal growth theory. The material in this chapter is drawn in great part
from the discussion of optimal control theory in Elements of Dynamic Optimization by
Alpha C. Chiang (McGraw-Hill 1992, now published by Waveland Press, Inc.), which presents
a thorough treatment of both optimal control and its precursor, calculus of variations.

Aside from the two new chapters, there are several significant additions and refinements
to this edition. In Chap. 3 we have expanded the discussion of solving higher-order
polynomial equations by factoring (Sec. 3.3). In Chap. 4, a new section on Markov chains
has been added (Sec. 4.7). And, in Chap. 5, we have introduced the checking of the rank of
a matrix via an echelon matrix (Sec. 5.1), and the Hawkins-Simon condition in connection
with the Leontief input-output model (Sec. 5.7). With respect to economic applications,
many new examples have been added and some of the existing applications have been
enhanced. A linear version of the IS-LM model has been included in Sec. 5.6, and a more
general form of the model in Sec. 8.6 has been expanded to encompass both a closed and open
economy, thereby demonstrating a much richer application of comparative statics to
general-function models. Other additions include a discussion of expected utility and risk
preferences (Sec. 9.3), a profit-maximization model that incorporates the Cobb-Douglas
production function (Sec. 11.6), and a two-period intertemporal choice problem
(Sec. 12.3). Finally, the exercise problems have been revised and augmented, giving
students a greater opportunity to hone their skills.





Contents 








PART ONE
INTRODUCTION 1 


Chapter 1 
The Nature of Mathematical 
Economics 2 


1.1 Mathematical versus Nonmathematical Economics 2
1.2 Mathematical Economics versus Econometrics 4


Chapter 2 
Economic Models 5 


2.1 Ingredients of a Mathematical Model 5
Variables, Constants, and Parameters 5
Equations and Identities 6

2.2 The Real-Number System 7

2.3 The Concept of Sets 8
Set Notation 9
Relationships between Sets 9
Operations on Sets 11
Laws of Set Operations 12
Exercise 2.3 14

2.4 Relations and Functions 15
Ordered Pairs 15
Relations and Functions 16
Exercise 2.4 19

2.5 Types of Function 20
Constant Functions 20
Polynomial Functions 20
Rational Functions 21
Nonalgebraic Functions 23
A Digression on Exponents 23
Exercise 2.5 24

2.6 Functions of Two or More Independent Variables 25

2.7 Levels of Generality 27


PART TWO 


STATIC (OR EQUILIBRIUM) 
ANALYSIS 29


Chapter 3 
Equilibrium Analysis in Economics 30 


3.1 The Meaning of Equilibrium 30 
3.2 Partial Market Equilibrium—A Linear 
Model 31 

Constructing the Model 31 
Solution by Elimination of Variables 33
Exercise 3.2 34

3.3 Partial Market Equilibrium—A Nonlinear 

Model 35 

Quadratic Equation versus Quadratic 
Function 35 
The Quadratic Formula 36 
Another Graphical Solution 37 
Higher-Degree Polynomial Equations 38 
Exercise 3.3 40

3.4 General Market Equilibrium 40 
Two-Commodity Market Model 41
Numerical Example 42
n-Commodity Case 43
Solution of a General-Equation System 44 
Exercise 3.4 45 

3.5 Equilibrium in National-Income Analysis 46 
Exercise 3.5 47


Chapter 4 
Linear Models and Matrix Algebra 48 


4.1 Matrices and Vectors 49 
Matrices as Arrays 49 
Vectors as Special Matrices 50
Exercise 4.1 51

4.2 Matrix Operations 51
Addition and Subtraction of Matrices 51
Scalar Multiplication 52




Multiplication of Matrices 53
The Question of Division 56
The Σ Notation 56
Exercise 4.2 58

4.3 Notes on Vector Operations 59
Multiplication of Vectors 59
Geometric Interpretation of Vector
Operations 60
Linear Dependence 62
Vector Space 63
Exercise 4.3 65 

4.4 Commutative, Associative, and Distributive 

Laws 67 

Matrix Addition 67 
Matrix Multiplication 68 
Exercise 4.4 69 

4.5 Identity Matrices and Null Matrices 70
Identity Matrices 70
Null Matrices 71
Idiosyncrasies of Matrix Algebra 72
Exercise 4.5 72 

4.6 Transposes and Inverses 73 
Properties of Transposes 74 
Inverses and Their Properties 75
Inverse Matrix and Solution of
Linear-Equation System 77 
Exercise 4.6 78 

4.7 Finite Markov Chains 78
Special Case: Absorbing Markov Chains 81 
Exercise 4.7 81 


Chapter 5 
Linear Models and Matrix Algebra 
(Continued) 82 


5.1 Conditions for Nonsingularity of a Matrix 82
Necessary versus Sufficient Conditions 82
Conditions for Nonsingularity 84
Rank of a Matrix 85
Exercise 5.1 87

5.2 Test of Nonsingularity by Use of 

Determinant 88 
Determinants and Nonsingularity 88 
Evaluating a Third-Order Determinant 89
Evaluating an nth-Order Determinant by 
Laplace Expansion 91 
Exercise 5.2 93 


5.3 Basic Properties of Determinants 94 
Determinantal Criterion for 
Nonsingularity 96 
Rank of a Matrix Redefined 97
Exercise 5.3 98 

5.4 Finding the Inverse Matrix 99 
Expansion of a Determinant by Alien
Cofactors 99
Matrix Inversion 100
Exercise 5.4 102 

5.5 Cramer's Rule 103 
Derivation of the Rule 103
Note on Homogeneous-Equation Systems 105
Solution Outcomes for a Linear-Equation
System 106
Exercise 5.5 107

5.6 Application to Market and National-Income

Models 107 
Market Model 107 
National-Income Model 108 
IS-LM Model: Closed Economy 109 
Matrix Algebra versus Elimination of 
Variables 111
Exercise 5.6 112

5.7 Leontief Input-Output Models 112
Structure of an Input-Output Model 112
The Open Model 113
A Numerical Example 115
The Existence of Nonnegative Solutions 116
Economic Meaning of the Hawkins-Simon
Condition 118
The Closed Model 119 
Exercise 5.7 120 

5.8 Limitations of Static Analysis 120 





PART THREE 
COMPARATIVE-STATIC 
ANALYSIS 123 


Chapter 6 


Comparative Statics and the Concept
of Derivative 124


6.1 The Nature of Comparative Statics 124 
6.2 Rate of Change and the Derivative 125 
The Difference Quotient 125 


The Derivative 126 
Exercise 6.2 127 


6.3 The Derivative and the Slope of a Curve 128

6.4 The Concept of Limit 129
Left-Side Limit and Right-Side Limit 129
Graphical Illustrations 130
Evaluation of a Limit 131
Formal View of the Limit Concept 133
Exercise 6.4 135 

6.5 Digression on Inequalities and Absolute 

Values 136

Rules of Inequalities 136
Absolute Values and Inequalities 137
Solution of an Inequality 138
Exercise 6.5 139

6.6 Limit Theorems 139
Theorems Involving a Single Function 139 
Theorems Involving Two Functions 140 
Limit of a Polynomial Function 141 
Exercise 6.6 141

6.7 Continuity and Differentiability of a 

Function 141 

Continuity of a Function 141
Polynomial and Rational Functions 142
Differentiability of a Function 143
Exercise 6.7 146 


Chapter 7 
Rules of Differentiation and Their Use 
in Comparative Statics 148 


7.1 Rules of Differentiation for a Function of 
One Variable 148 
Constant-Function Rule 148 
Power-Function Rule 149
Power-Function Rule Generalized 151
Exercise 7.1 152 
7.2 Rules of Differentiation Involving Two or 
More Functions of the Same Variable 152 
Sum-Difference Rule 152 
Product Rule 155 
Finding Marginal-Revenue Function from
Average-Revenue Function 156
Quotient Rule 158
Relationship Between Marginal-Cost and
Average-Cost Functions 159

Exercise 7.2 160 




7.3 Rules of Differentiation Involving 
Functions of Different Variables 161 

Chain Rule 161 
Inverse-Function Rule 163
Exercise 7.3 165 

7.4 Partial Differentiation 165 
Partial Derivatives 165 
Techniques of Partial 
Differentiation 166 
Geometric Interpretation of Partial 
Derivatives 167 
Gradient Vector 168 
Exercise 7.4 169 

7.5 Applications to Comparative-Static
Analysis 170

Market Model 170
National-Income Model 172
Input-Output Model 173
Exercise 7.5 175 

7.6 Note on Jacobian Determinants 175 
Exercise 7.6 177 


Chapter 8 
Comparative-Static Analysis of 
General-Function Models 178 


8.1 Differentials 179 
Differentials and Derivatives 179 
Differentials and Point Elasticity 181
Exercise 8.1 184

8.2 Total Differentials 184 
Exercise 8.2 186 

8.3 Rules of Differentials 187 
Exercise 8.3 189

8.4 Total Derivatives 189 
Finding the Total Derivative 189 
A Variation on the Theme 191 
Another Variation on the Theme 192 
Some General Remarks 193
Exercise 8.4 193

8.5 Derivatives of Implicit Functions 194 
Implicit Functions 194 
Derivatives of Implicit Functions 196
Extension to the Simultaneous-Equation
Case 199 
Exercise 8.5 204 




8.6 Comparative Statics of General-Function 
Models 205 

Market Model 205 
Simultaneous-Equation Approach 207
Use of Total Derivatives 209
National-Income Model (IS-LM) 210
Extending the Model: An Open
Economy 213
Summary of the Procedure 216
Exercise 8.6 217 

8.7 Limitations of Comparative Statics 218 


PART FOUR 
OPTIMIZATION PROBLEMS 219


Chapter 9 
Optimization: A Special Variety of 
Equilibrium Analysis 220 


9.1 Optimum Values and Extreme Values 221 
9.2 Relative Maximum and Minimum: 
First-Derivative Test 222
Relative versus Absolute Extremum 222 
First-Derivative Test 223 
Exercise 9.2 226 
9.3 Second and Higher Derivatives 227 
Derivative of a Derivative 227 
Interpretation of the Second Derivative 229
An Application 231 
Attitudes toward Risk 231 
Exercise 9.3 233 
9.4 Second-Derivative Test 233 
Necessary versus Sufficient Conditions 234 
Conditions for Profit Maximization 235 
Coefficients of a Cubic Total-Cost 
Function 238 
Upward-Sloping Marginal-Revenue 
Curve 240 
Exercise 9.4 241 
9.5 Maclaurin and Taylor Series 242 
Maclaurin Series of a Polynomial 
Function 242 
Taylor Series of a Polynomial Function 244 
Expansion of an Arbitrary Function 245 
Lagrange Form of the Remainder 248
Exercise 9.5 250 


9.6 Nth-Derivative Test for Relative Extremum of 
a Function of One Variable 250 
Taylor Expansion and Relative Extremum 250 
Some Specific Cases 251 
Nth-Derivative Test 253 
Exercise 9.6 254 


Chapter 10 
Exponential and Logarithmic 
Functions 255 


10.1 The Nature of Exponential Functions 256 
Simple Exponential Function 256 
Graphical Form 256 
Generalized Exponential Function 257 
A Preferred Base 259 
Exercise 10.1 260

10.2 Natural Exponential Functions and the
Problem of Growth 260
The Number e 260
An Economic Interpretation of e 262
Interest Compounding and the Function
Ae^rt 262
Instantaneous Rate of Growth 263
Continuous versus Discrete Growth 265
Discounting and Negative Growth 266 
Exercise 10.2 267 

10.3 Logarithms 267 
The Meaning of Logarithm 267 
Common Log and Natural Log 268 
Rules of Logarithms 269
An Application 271 
Exercise 10.3 272 

10.4 Logarithmic Functions 272 
Log Functions and Exponential Functions 272 
The Graphical Form 273 
Base Conversion 274 
Exercise 10.4 276 

10.5 Derivatives of Exponential and Logarithmic 

Functions 277 
Log-Function Rule 277 
Exponential-Function Rule 278
The Rules Generalized 278
The Case of Base b 280
Higher Derivatives 280 
An Application 281 
Exercise 10.5 282 


10.6 Optimal Timing 282 
A Problem of Wine Storage 282
Maximization Conditions 283 
A Problem of Timber Cutting 285 
Exercise 10.6 286 

10.7 Further Applications of Exponential and 

Logarithmic Derivatives 286 

Finding the Rate of Growth 286 
Rate of Growth of a Combination of
Functions 287
Finding the Point Elasticity 288
Exercise 10.7 290


Chapter 11 
The Case of More than One Choice 
Variable 291 


11.1 The Differential Version of Optimization 
Conditions 291 
First-Order Condition 291 
Second-Order Condition 292 
Differential Conditions versus Derivative 
Conditions 293 
11.2 Extreme Values of a Function of Two 
Variables 293 
First-Order Condition 294 
Second-Order Partial Derivatives 295 
Second-Order Total Differential 297 
Second-Order Condition 298 
Exercise 11.2 300 
11.3 Quadratic Forms—An Excursion 301 
Second-Order Total Differential as a Quadratic 
Form 301 
Positive and Negative Definiteness 302 
Determinantal Test for Sign 
Definiteness 302 
Three-Variable Quadratic Forms 305 
n-Variable Quadratic Forms 307
Characteristic-Root Test for Sign
Definiteness 307
Exercise 11.3 312
11.4 Objective Functions with More than Two
Variables 313
First-Order Condition for Extremum 313
Second-Order Condition 313
n-Variable Case 316
Exercise 11.4 317










11.5 Second-Order Conditions in Relation to 
Concavity and Convexity 318
Checking Concavity and Convexity 320 
Differentiable Functions 324 
Convex Functions versus Convex Sets 327 
Exercise 11.5 330 
11.6 Economic Applications 331 
Problem of a Multiproduct Firm 331 
Price Discrimination 333 
Input Decisions of a Firm 336
Exercise 11.6 341
11.7 Comparative-Static Aspects of
Optimization 342
Reduced-Form Solutions 342
General-Function Models 343
Exercise 11.7 345


Chapter 12 
Optimization with Equality 
Constraints 347 


12.1 Effects of a Constraint 347 

12.2 Finding the Stationary Values 349 
Lagrange-Multiplier Method 350 
Total-Differential Approach 352 
An Interpretation of the Lagrange
Multiplier 353 
n-Variable and Multiconstraint Cases 354 
Exercise 12.2 355 

12.3 Second-Order Conditions 356 
Second-Order Total Differential 356 
Second-Order Conditions 357
The Bordered Hessian 358
n-Variable Case 361
Multiconstraint Case 362 
Exercise 12.3 363 

12.4 Quasiconcavity and Quasiconvexity 364 
Geometric Characterization 364 
Algebraic Definition 365 
Differentiable Functions 368 
A Further Look at the Bordered Hessian 371 
Absolute versus Relative Extrema 372 
Exercise 12.4 374 

12.5 Utility Maximization and Consumer 

Demand 374 

First-Order Condition 375 
Second-Order Condition 376 







Comparative-Static Analysis 378 
Proportionate Changes in Prices 
and Income 381
Exercise 12.5 382 

12.6 Homogeneous Functions 383 
Linear Homogeneity 383 
Cobb-Douglas Production Function 386 
Extensions of the Results 388
Exercise 12.6 389

12.7 Least-Cost Combination of Inputs 390
First-Order Condition 390 
Second-Order Condition 392 
The Expansion Path 392 
Homothetic Functions 394 
Elasticity of Substitution 396 
CES Production Function 397 
Cobb-Douglas Function as a Special Case of 
the CES Function 399 
Exercise 12.7 401


Chapter 13 
Further Topics in Optimization 402 


13.1 Nonlinear Programming and Kuhn-Tucker 
Conditions 402 

Step 1: Effect of Nonnegativity 
Restrictions 403 
Step 2: Effect of Inequality Constraints 404
Interpretation of the Kuhn-Tucker
Conditions 408
The n-Variable, m-Constraint Case 409
Exercise 13.1 411

13.2 The Constraint Qualification 412
Irregularities at Boundary Points 412
The Constraint Qualification 415 
Linear Constraints 416 
Exercise 13.2 418 

13.3 Economic Applications 418 
War-Time Rationing 418 
Peak-Load Pricing 420
Exercise 13.3 423

13.4 Sufficiency Theorems in Nonlinear
Programming 424

The Kuhn-Tucker Sufficiency Theorem:
Concave Programming 424 


The Arrow-Enthoven Sufficiency Theorem: 
Quasiconcave Programming 425 
A Constraint-Qualification Test 426 
Exercise 13.4 427 
13.5 Maximum-Value Functions and the 
Envelope Theorem 428 
The Envelope Theorem for Unconstrained 
Optimization 428 
The Profit Function 429 
Reciprocity Conditions 430 
The Envelope Theorem for Constrained
Optimization 432
Interpretation of the Lagrange Multiplier 434
13.6 Duality and the Envelope Theorem 435
The Primal Problem 435
The Dual Problem 436
Duality 436
Roy’s Identity 437
Shephard’s Lemma 438
Exercise 13.6 441
13.7 Some Concluding Remarks 442 





PART FIVE 
DYNAMIC ANALYSIS 443 


Chapter 14 
Economic Dynamics and Integral 
Calculus 444


14.1 Dynamics and Integration 444 
14.2 Indefinite Integrals 446 
The Nature of Integrals 446 
Basic Rules of Integration 447 
Rules of Operation 448 
Rules Involving Substitution 451
Exercise 14.2 453 
14.3 Definite Integrals 454 
Meaning of Definite Integrals 454 
A Definite Integral as an Area under a 
Curve 455 
Some Properties of Definite Integrals 458 
Another Look at the Indefinite
Integral 460 
Exercise 14.3 460 


14.4 Improper Integrals 461 
Infinite Limits of Integration 461 
Infinite Integrand 463 
Exercise 14.4 464 

14.5 Some Economic Applications of 

Integrals 464 

From a Marginal Function to a Total 
Function 464 
Investment and Capital Formation 465 
Present Value of a Cash Flow 468
Present Value of a Perpetual Flow 470
Exercise 14.5 470 

14.6 Domar Growth Model 471 
The Framework 471 
Finding the Solution 472 
The Razor's Edge 473 
Exercise 14.6 474 


Chapter 15 
Continuous Time: First-Order 
Differential Equations 475 


15.1 First-Order Linear Differential Equations 
with Constant Coefficient and Constant
Term 475 

The Homogeneous Case 476 
The Nonhomogeneous Case 476 
Verification of the Solution 478 
Exercise 15.1 479 

15.2 Dynamics of Market Price 479 
The Framework 480 
The Time Path 480 
The Dynamic Stability of Equilibrium 481 
An Alternative Use of the Model 482 
Exercise 15.2 483 

15.3 Variable Coefficient and Variable Term 483 
The Homogeneous Case 484 
The Nonhomogeneous Case 485
Exercise 15.3 486 

15.4 Exact Differential Equations 486 
Exact Differential Equations 486 
Method of Solution 487
Integrating Factor 489
Solution of First-Order Linear Differential 
Equations 490 
Exercise 15.4 491 




15.5 Nonlinear Differential Equations of the First
Order and First Degree 492
Exact Differential Equations 492
Separable Variables 492
Equations Reducible to the Linear Form 493
Exercise 15.5 495 
15.6 The Qualitative-Graphic Approach 495 
The Phase Diagram 495 
Types of Time Path 496 
Exercise 15.6 498 
15.7 Solow Growth Model 498 
The Framework 498 
A Qualitative-Graphic Analysis 500 
A Quantitative Illustration 501
Exercise 15.7 502


Chapter 16 
Higher-Order Differential 
Equations 503 


16.1 Second-Order Linear Differential Equations 
with Constant Coefficients and Constant 
Term 504 

The Particular Integral 504 

The Complementary Function 505
The Dynamic Stability of Equilibrium 510
Exercise 16.1 511

16.2 Complex Numbers and Circular
Functions 511
Imaginary and Complex Numbers 511
Complex Roots 512
Circular Functions 513
Properties of the Sine and Cosine
Functions 515
Euler Relations 517 
Alternative Representations of Complex 
Numbers 519 
Exercise 16.2 521 

16.3 Analysis of the Complex-Root 

Case 522 
The Complementary Function 522
An Example of Solution 524
The Time Path 525
The Dynamic Stability of 
Equilibrium 527 
Exercise 16.3 527 




16.4 A Market Model with Price 
Expectations 527 
Price Trend and Price Expectations 527 
A Simplified Model 528 
The Time Path of Price 529 
Exercise 16.4 532
16.5 The Interaction of Inflation and 
Unemployment 532 
The Phillips Relation 532 
The Expectations-Augmented Phillips 
Relation 533 
The Feedback from Inflation to
Unemployment 534
The Time Path of π 534
Exercise 16.5 537
16.6 Differential Equations with a Variable
Term 538
Method of Undetermined Coefficients 538
A Modification 539
Exercise 16.6 540
16.7 Higher-Order Linear Differential
Equations 540
Finding the Solution 540
Convergence and the Routh Theorem 542
Exercise 16.7 543 


Chapter 17 
Discrete Time: First-Order Difference 
Equations 544 


17.1 Discrete Time, Differences, and Difference
Equations 544
17.2 Solving a First-Order Difference
Equation 546
Iterative Method 546
General Method 548
Exercise 17.2 551
17.3 The Dynamic Stability of Equilibrium 551
The Significance of b 551
The Role of A 553
Convergence to Equilibrium 554
Exercise 17.3 554
17.4 The Cobweb Model 555
The Model 555
The Cobwebs 556 
Exercise 17.4 558 


17.5 A Market Model with Inventory 559 
The Model 559 
The Time Path 560
Graphical Summary of the Results 561 
Exercise 17.5 562 

17.6 Nonlinear Difference Equations: 

The Qualitative-Graphic Approach 562

Phase Diagram 562 
Types of Time Path 564 
A Market with a Price Ceiling 565 
Exercise 17.6 567 


Chapter 18 
Higher-Order Difference Equations 568 


18.1 Second-Order Linear Difference Equations
with Constant Coefficients and Constant
Term 569

Particular Solution 569
Complementary Function 570
The Convergence of the Time Path 573
Exercise 18.1 575

18.2 Samuelson Multiplier-Acceleration
Interaction Model 576
The Framework 576
The Solution 577
Convergence versus Divergence 578
A Graphical Summary 580
Exercise 18.2 581

18.3 Inflation and Unemployment in Discrete 

Time 581 
The Model 581
The Difference Equation in p 582
The Time Path of p 583
The Analysis of U 584
The Long-Run Phillips Relation 585
Exercise 18.3 585

18.4 Generalizations to Variable-Term and
Higher-Order Equations 586
Variable Term in the Form of cm^t 586
Variable Term in the Form ct^n 587
Higher-Order Linear Difference
Equations 588
Convergence and the Schur
Theorem 589
Exercise 18.4 591


Chapter 19 
Simultaneous Differential Equations and 
Difference Equations 592 


19.1 The Genesis of Dynamic Systems 592 
Interacting Patterns of Change 592 


The Transformation of a High-Order Dynamic
Equation 593
19.2 Solving Simultaneous Dynamic
Equations 594
Simultaneous Difference Equations 594
Matrix Notation 596
Simultaneous Differential Equations 599
Further Comments on the Characteristic
Equation 601
Exercise 19.2 602 
19.3 Dynamic Input-Output Models 603 
Time Lag in Production 603 
Excess Demand and Output Adjustment 605 
Capital Formation 607 
Exercise 19.3 608 
19.4 The Inflation-Unemployment Model Once 
More 609 
Simultaneous Differential Equations 610 
Solution Paths 610 
Simultaneous Difference Equations 612 
Solution Paths 613 
Exercise 19.4 614 
19.5 Two-Variable Phase Diagrams 614 
The Phase Space 615 
The Demarcation Curves 615 
Streamlines 617 
Types of Equilibrium 618
Inflation and Monetary Rule à la Obst 620
Exercise 19.5 623
19.6 Linearization of a Nonlinear Differential-
Equation System 623 
Taylor Expansion and Linearization 624 




The Reduced Linearization 625 
Local Stability Analysis 625
Exercise 19.6 629


Chapter 20 
Optimal Control Theory 631 


20.1 The Nature of Optimal Control 631 
Illustration: A Simple Macroeconomic
Model 632
Pontryagin’s Maximum Principle 633

20.2 Alternative Terminal Conditions 639
Fixed Terminal Point 639
Horizontal Terminal Line 639
Truncated Vertical Terminal Line 639
Truncated Horizontal Terminal Line 640 
Exercise 20.2 643 

20.3 Autonomous Problems 644 

20.4 Economic Applications 645 
Lifetime Utility Maximization 645
Exhaustible Resource 647
Exercise 20.4 649

20.5 Infinite Time Horizon 649
Neoclassical Optimal Growth Model 649
The Current-Value Hamiltonian 651
Constructing a Phase Diagram 652 
Analyzing the Phase Diagram 653 

20.6 Limitations of Dynamic Analysis 654 


The Greek Alphabet 655 
Mathematical Symbols 656 

A Short Reading List 659 
Answers to Selected Exercises 662 
Index 677 





McGraw-Hill 
Irwin 


FUNDAMENTAL METHODS OF MATHEMATICAL ECONOMICS, 


Published by McGraw-Hill/Irwin, a business unit of The McGraw-Hill Companies, Inc.,
1221 Avenue of the Americas, New York, NY, 10020. Copyright © 2005, 1984, 1974, 1967 by
The McGraw-Hill Companies, Inc. All rights reserved. No part of this publication may be reproduced or
distributed in any form or by any means, or stored in a database or retrieval system, without the prior
written consent of The McGraw-Hill Companies, Inc., including, but not limited to, in any network or
other electronic storage or transmission, or broadcast for distance learning.


Some ancillaries, including electronic and print components, may not be available to customers outside the United States.
This book is printed on acid-free paper.

1 2 3 4 5 6 7 8 9 0 DOC/DOC 0 9 8 7 6 5 4

ISBN 0-07-010910-9 


About the cover: The graph in Figure 20.1 on page 634 illustrates that the shortest distance between two points is a
straight line. We chose it as the basis for the cover design because such a simple truth requires one of the most
advanced techniques found in this book.


Publisher: Gary Burke
Executive editor: Lucille Sutton
Developmental editor: Rebecca Hicks
Editorial assistant: Jackie Grabel
Senior marketing manager: Martin D. Quinn
Senior media producer: Kai Chiang
Project manager: Bruce Gist
Production supervisor: Debra R. Sylvester
Designer: Kami Carter
Supplement producer: Lynn M. Bluhm
Senior digital content specialist: Brian Nacik
Cover design: Kami Carter
Typeface: 10/12 Times New Roman
Compositor: Interactive Composition Corporation
Printer: R. R. Donnelley


Library of Congress Cataloging-in-Publication Data


Chiang, Alpha C., 1927-
Fundamental methods of mathematical economics / Alpha C. Chiang, Kevin
Wainwright. 4th ed.
p. cm.
Includes bibliographical references and index.
ISBN 0-07-010910-9 (alk. paper)
1. Economics, Mathematical. I. Wainwright, Kevin. II. Title.
HB135.C47 2005
330.01'51 dc22
2004059546
www.mhhe.com


Part One


Introduction 





Chapter 1





The Nature of 
Mathematical Economics 


Mathematical economics is not a distinct branch of economics in the sense that public
finance or international trade is. Rather, it is an approach to economic analysis, in which
the economist makes use of mathematical symbols in the statement of the problem and also
draws upon known mathematical theorems to aid in reasoning. As far as the specific subject
matter of analysis goes, it can be micro- or macroeconomic theory, public finance,
urban economics, or what not.

Using the term mathematical economics in the broadest possible sense, one may very
well say that every elementary textbook of economics today exemplifies mathematical
economics insofar as geometrical methods are frequently utilized to derive theoretical
results. More commonly, however, mathematical economics is reserved to describe cases
employing mathematical techniques beyond simple geometry, such as matrix algebra,
differential and integral calculus, differential equations, difference equations, etc. It is
the purpose of this book to introduce the reader to the most fundamental aspects of these
mathematical methods: those encountered daily in the current economic literature.


1.1 Mathematical versus Nonmathematical Economics 





Since mathematical economics is merely an approach to economic analysis, it should not
and does not fundamentally differ from the nonmathematical approach to economic analysis.
The purpose of any theoretical analysis, regardless of the approach, is always to derive
a set of conclusions or theorems from a given set of assumptions or postulates via a process
of reasoning. The major difference between “mathematical economics” and “literary
economics” is twofold: First, in the former, the assumptions and conclusions are stated in
mathematical symbols rather than words and in equations rather than sentences. Second,
in place of literary logic, use is made of mathematical theorems, of which there exists an
abundance to draw upon, in the reasoning process. Inasmuch as symbols and words are
really equivalents (witness the fact that symbols are usually defined in words), it matters
little which is chosen over the other. But it is perhaps beyond dispute that symbols are more
convenient to use in deductive reasoning, and certainly are more conducive to conciseness
and preciseness of statement.




The choice between literary logic and mathematical logic, again, is a matter of little import,
but mathematics has the advantage of forcing analysts to make their assumptions explicit
at every stage of reasoning. This is because mathematical theorems are usually stated
in the “if-then” form, so that in order to tap the “then” (result) part of the theorem for their
use, they must first make sure that the “if” (condition) part does conform to the explicit
assumptions adopted.

Granting these points, though, one may still ask why it is necessary to go beyond geo- 
metric methods. The answer is that while geometric analysis has the important advantage 
of being visual, it also suffers from a serious dimensional limitation. In the usual graphical
discussion of indifference curves, for instance, the standard assumption is that only two
commodities are available to the consumer. Such a simplifying assumption is not willingly
adopted but is forced upon us because the task of drawing a three-dimensional graph is
exceedingly difficult, and the construction of a four- (or higher) dimensional graph is actually
a physical impossibility. To deal with the more general case of 3, 4, or n goods, we must
instead resort to the more flexible tool of equations. This reason alone should provide
sufficient motivation for the study of mathematical methods beyond geometry.

In short, we see that the mathematical approach has claim to the following advantages:
(1) The “language” used is more concise and precise; (2) there exists a wealth of mathe- 
matical theorems at our service; (3) in forcing us to state explicitly all our assumptions as 
a prerequisite to the use of the mathematical theorems, it keeps us from the pitfall of an un- 
intentional adoption of unwanted implicit assumptions; and (4) it allows us to treat the 
general n-variable case. 

Against these advantages, one sometimes hears the criticism that a mathematically derived
theory is inevitably unrealistic. However, this criticism is not valid. In fact, the epithet
“unrealistic” cannot even be used in criticizing economic theory in general, whether or not
the approach is mathematical. Theory is by its very nature an abstraction from the real
world. It is a device for singling out only the most essential factors and relationships so that
we can study the crux of the problem at hand, free from the many complications that do 
exist in the actual world. Thus the statement “theory lacks realism” is merely a truism that 
cannot be accepted as a valid criticism of theory. By the same token, it is quite meaningless 
to pick out any one approach to theory as “unrealistic.” For example, the theory of firm 
under pure competition is unrealistic, as is the theory of firm under imperfect competition, 
but whether these theories are derived mathematically or not is irrelevant and immaterial. 

To take advantage of the wealth of mathematical tools, one must of course first acquire
those tools. Unfortunately, the tools that are of interest to economists are widely scattered 
among many mathematics courses—too many to fit comfortably into the plan of study of a 
typical economics student. The service the present volume performs is to gather in one 
place the mathematical methods most relevant to the economics literature, organize them
into a logical order of progression, fully explain each method, and then immediately illus- 
trate how the method is applied in economic analysis. By tying together the methods 
and their applications, the relevance of mathematics to economics is made more transpar- 
ent than is possible in the regular mathematics courses where the illustrated applications 
are predominantly tied to physics and engineering. Familiarity with the contents of this 
book (and, if possible, also its sequel volume: Alpha C. Chiang, Elements of Dynamic 
Optimization, McGraw-Hill, 1992, now published by Waveland Press, Inc.) should therefore
enable you to comprehend most of the professional articles you will come across in







such periodicals as the American Economic Review, Quarterly Journal of Economics, 
Journal of Political Economy, Review of Economics and Statistics, and Economic Journal. 
Those of you who, through this exposure, develop a serious interest in mathematical
economics can then proceed to a more rigorous and advanced study of mathematics. 


1.2 Mathematical Economics versus Econometrics





The term mathematical economics is sometimes confused with a related term, econometrics.
As the “metric” part of the latter term implies, econometrics is concerned mainly with
the measurement of economic data. Hence it deals with the study of empirical observations
using statistical methods of estimation and hypothesis testing. Mathematical economics, on
the other hand, refers to the application of mathematics to the purely theoretical aspects of
economic analysis, with little or no concern about such statistical problems as the errors of
measurement of the variables under study.

In the present volume, we shall confine ourselves to mathematical economics. That is,
we shall concentrate on the application of mathematics to deductive reasoning rather than
inductive study, and as a result we shall be dealing primarily with theoretical rather than
empirical material. This is, of course, solely a matter of choice of the scope of discussion,
and it is by no means implied that econometrics is less important.

Indeed, empirical studies and theoretical analyses are often complementary and mutually
reinforcing. On the one hand, theories must be tested against empirical data for validity
before they can be applied with confidence. On the other, statistical work needs
economic theory as a guide, in order to determine the most relevant and fruitful direction 
of research. 

In one sense, however, mathematical economics may be considered as the more basic of 
the two: for, to have a meaningful statistical and econometric study, a good theoretical 
framework—preferably in a mathematical formulation—is indispensable. Hence the 
subject matter of the present volume should be useful not only for those interested in theo- 
retical economics, but also for those seeking a foundation for the pursuit of econometric 
studies. 





Chapter 2

Economic Models


As mentioned before, any economic theory is necessarily an abstraction from the real 
world. For one thing, the immense complexity of the real economy makes it impossible for 
us to understand all the interrelationships at once; nor, for that matter, are all these
interrelationships of equal importance for the understanding of the particular economic
phenomenon under study. The sensible procedure is, therefore, to pick out what appeals to our
reason to be the primary factors and relationships relevant to our problem and to focus our
attention on these alone. Such a deliberately simplified analytical framework is called an
economic model, since it is only a skeletal and rough representation of the actual economy. 


2.1 Ingredients of a Mathematical Model





An economic model is merely a theoretical framework, and there is no inherent reason why 
it must be mathematical. If the model is mathematical, however, it will usually consist of a
set of equations designed to describe the structure of the model. By relating a number of
variables to one another in certain ways, these equations give mathematical form to the set
of analytical assumptions adopted. Then, through application of the relevant mathematical
operations to these equations, we may seek to derive a set of conclusions which logically
follow from those assumptions. 


Variables, Constants, and Parameters 

A variable is something whose magnitude can change, i.e., something that can take on
different values. Variables frequently used in economics include price, profit, revenue, cost,
national income, consumption, investment, imports, and exports. Since each variable can
assume various values, it must be represented by a symbol instead of a specific number. For
example, we may represent price by P, profit by $\pi$, revenue by R, cost by C, national
income by Y, and so forth. When we write P = 3 or C = 18, however, we are “freezing”
these variables at specific values (in appropriately chosen units).

Properly constructed, an economic model can be solved to give us the solution values
of a certain set of variables, such as the market-clearing level of price, or the
profit-maximizing level of output. Such variables, whose solution values we seek from the model,
are known as endogenous variables (originating from within). However, the model may 
also contain variables which are assumed to be determined by forces external to the model, 







and whose magnitudes are accepted as given data only; such variables are called exogenous 
variables (originating from without). It should be noted that a variable that is endogenous 
to one model may very well be exogenous to another. In an analysis of the market determi- 
nation of wheat price (P), for instance, the variable P should definitely be endogenous; but 
in the framework of a theory of consumer expenditure, P would become instead a datum to 
the individual consumer, and must therefore be considered exogenous.

Variables frequently appear in combination with fixed numbers or constants, such as
in the expressions 7P or 0.5R. A constant is a magnitude that does not change and is
therefore the antithesis of a variable. When a constant is joined to a variable, it is often referred
to as the coefficient of that variable. However, a coefficient may be symbolic rather than
numerical. We can, for instance, let the symbol a stand for a given constant and use the
expression aP in lieu of 7P in a model, in order to attain a higher level of generality
(see Sec. 2.7). This symbol a is a rather peculiar case: it is supposed to represent a given
constant, and yet, since we have not assigned to it a specific number, it can take virtually
any value. In short, it is a constant that is variable! To identify its special status, we give it
the distinctive name parametric constant (or simply parameter).

It must be duly emphasized that, although different values can be assigned to a parameter,
it is nevertheless to be regarded as a datum in the model. It is for this reason that people
sometimes simply say “constant” even when the constant is parametric. In this respect,
parameters closely resemble exogenous variables, for both are to be treated as “givens” in 
a model. This explains why many writers, for simplicity, refer to both collectively with the 
single designation “parameters.” 

As a matter of convention, parametric constants are normally represented by the symbols
a, b, c, or their counterparts in the Greek alphabet: $\alpha$, $\beta$, and $\gamma$. But other symbols
naturally are also permissible. As for exogenous variables, in order that they can be visually
distinguished from their endogenous cousins, we shall follow the practice of attaching a
subscript 0 to the chosen symbol. For example, if P symbolizes price, then $P_0$ signifies an
exogenously determined price.











Equations and Identities 
Variables may exist independently, but they do not really become interesting until they are 
related to one another by equations or by inequalities. At this moment we shall discuss 
equations only. 

In economic applications we may distinguish between three types of equation: definitional
equations, behavioral equations, and conditional equations.

A definitional equation sets up an identity between two alternate expressions that have
exactly the same meaning. For such an equation, the identical-equality sign $\equiv$ (read: “is
identically equal to”) is often employed in place of the regular equals sign =, although the
latter is also acceptable. As an example, total profit is defined as the excess of total revenue
over total cost; we can therefore write











$$\pi \equiv R - C$$


A behavioral equation, on the other hand, specifies the manner in which a variable be- 
haves in response to changes in other variables. This may involve either human behavior 
(such as the aggregate consumption pattern in relation to national income) or nonhuman 
behavior (such as how total cost of a firm reacts to output changes). Broadly defined,







behavioral equations can be used to describe the general institutional setting of a model,
including the technological (e.g., production function) and legal (e.g., tax structure) aspects.
Before a behavioral equation can be written, however, it is always necessary to adopt defi- 
nite assumptions regarding the behavior pattern of the variable in question. Consider the 
two cost functions 





$$C = 75 + 10Q \tag{2.1}$$
$$C = 110 + Q^2 \tag{2.2}$$


where Q denotes the quantity of output. Since the two equations have different forms, the
production condition assumed in each is obviously different from the other. In (2.1), the
fixed cost (the value of C when Q = 0) is 75, whereas in (2.2) it is 110. The variation in cost
is also different. In (2.1), for each unit increase in Q, there is a constant increase of 10 in C.
But in (2.2), as Q increases unit after unit, C will increase by progressively larger amounts.
Clearly, it is primarily through the specification of the form of the behavioral equations that
we give mathematical expression to the assumptions adopted for a model.
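
The contrast between the two cost functions can be checked numerically. The following is a minimal Python sketch, not part of the original exposition; the function names and the choice to tabulate the first five unit increases in Q are our own assumptions for illustration.

```python
def cost_linear(q):
    """Cost function (2.1): C = 75 + 10Q."""
    return 75 + 10 * q


def cost_quadratic(q):
    """Cost function (2.2): C = 110 + Q^2."""
    return 110 + q ** 2


def unit_increments(cost, max_q):
    """Increase in C for each one-unit increase in Q, starting from Q = 0."""
    return [cost(q + 1) - cost(q) for q in range(max_q)]


# Fixed cost is the value of C when Q = 0.
fixed_costs = (cost_linear(0), cost_quadratic(0))

# Hypothetical range: the first five unit increases in output.
linear_steps = unit_increments(cost_linear, 5)
quadratic_steps = unit_increments(cost_quadratic, 5)

print(fixed_costs)       # (75, 110)
print(linear_steps)      # [10, 10, 10, 10, 10]
print(quadratic_steps)   # [1, 3, 5, 7, 9]
```

The constant steps of 10 and the growing steps 1, 3, 5, ... are exactly the two different assumptions about cost behavior that the forms of (2.1) and (2.2) express.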

As the third type, a conditional equation states a requirement to be satisfied. For example,
in a model involving the notion of equilibrium, we must set up an equilibrium condition,
which describes the prerequisite for the attainment of equilibrium. Two of the most
familiar equilibrium conditions in economics are 





$$Q_d = Q_s \quad \text{[quantity demanded = quantity supplied]}$$
and
$$S = I \quad \text{[intended saving = intended investment]}$$


which pertain, respectively, to the equilibrium of a market model and the equilibrium of the 
national-income model in its simplest form. Similarly, an optimization model either derives
or applies one or more optimization conditions. One such condition that comes easily to
mind is the condition 

$$MC = MR \quad \text{[marginal cost = marginal revenue]}$$


in the theory of the firm. Because equations of this type are neither definitional nor
behavioral, they constitute a class by themselves.


2.2 The Real-Number System 





Equations and variables are the essential ingredients of a mathematical model. But since 
the values that an economic variable takes are usually numerical, a few words should be 
said about the number system. Here, we shall deal only with so-called real numbers. 

Whole numbers such as 1, 2, 3, ... are called positive integers; these are the numbers
most frequently used in counting. Their negative counterparts -1, -2, -3, ... are called
negative integers; these can be employed, for example, to indicate subzero temperatures (in
degrees). The number 0 (zero), on the other hand, is neither positive nor negative, and is in
that sense unique. Let us lump all the positive and negative integers and the number zero
into a single category, referring to them collectively as the set of all integers.

Integers, of course, do not exhaust all the possible numbers, for we have fractions, such
as 2/3, 5/4, and 7/3, which, if placed on a ruler, would fall between the integers. Also, we have
negative fractions, such as -1/2 and -2/5. Together, these make up the set of all fractions.







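FIGURE 2.1 [Diagram of the real-number system: integers and fractions make up the rational numbers; rational and irrational numbers make up the real numbers.]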


The common property of all fractional numbers is that each is expressible as a ratio of
two integers. Any number that can be expressed as a ratio of two integers is called a rational
number. But integers themselves are also rational, because any integer n can be considered
as the ratio n/1. The set of all integers and the set of all fractions together form the set
of all rational numbers. An alternative defining characteristic of a rational number is that it
is expressible as either a terminating decimal (e.g., 1/4 = 0.25) or a repeating decimal (e.g.,
1/3 = 0.3333...), where some number or series of numbers to the right of the decimal point
is repeated indefinitely.
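
The two characterizations of a rational number (a ratio of two integers, and a terminating or repeating decimal) can be illustrated with Python's standard `fractions` module. This is only a sketch: the helper `decimal_digits` is a hypothetical name of our own, and it simply carries out long division on a nonnegative fraction.

```python
from fractions import Fraction


def decimal_digits(frac, places):
    """First `places` digits after the decimal point of a nonnegative
    Fraction, obtained by ordinary long division."""
    remainder = frac.numerator % frac.denominator
    digits = []
    for _ in range(places):
        remainder *= 10
        digits.append(remainder // frac.denominator)
        remainder %= frac.denominator
    return digits


quarter = Fraction(1, 4)
third = Fraction(1, 3)
seven = Fraction(7)          # the integer 7 viewed as the ratio 7/1

quarter_digits = decimal_digits(quarter, 6)   # terminating decimal 0.25
third_digits = decimal_digits(third, 6)       # repeating decimal 0.3333...

print(seven.numerator, seven.denominator)     # 7 1
print(quarter_digits)                         # [2, 5, 0, 0, 0, 0]
print(third_digits)                           # [3, 3, 3, 3, 3, 3]
```

Because long division by a fixed denominator can produce only finitely many different remainders, the digits must eventually either stop (remainder zero) or cycle, which is why every ratio of integers yields a terminating or repeating decimal.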

Once the notion of rational numbers is used, there naturally arises the concept of irrational
numbers: numbers that cannot be expressed as ratios of a pair of integers. One example
is the number $\sqrt{2} = 1.4142\ldots$, which is a nonrepeating, nonterminating decimal.
Another is the special constant $\pi = 3.1415\ldots$ (representing the ratio of the circumference
of any circle to its diameter), which is again a nonrepeating, nonterminating decimal, as is
characteristic of all irrational numbers.

Each irrational number, if placed on a ruler, would fall between two rational numbers, 
so that, just as the fractions fill in the gaps between the integers on a ruler, the irrational
numbers fill in the gaps between rational numbers. The result of this filling-in process is a
continuum of numbers, all of which are so-called real numbers. This continuum constitutes
the set of all real numbers, which is often denoted by the symbol R. When the set R is
displayed on a straight line (an extended ruler), we refer to the line as the real line.

In Fig. 2.1 are listed (in the order discussed) all the number sets, arranged in relationship
to one another. If we read from bottom to top, however, we find in effect a classificatory
scheme in which the set of real numbers is broken down into its component and subcomponent
number sets. This figure therefore is a summary of the structure of the real-number
system.

Real numbers are all we need for the first 15 chapters of this book, but they are not the 
only numbers used in mathematics. In fact, the reason for the term real is that there are also
“imaginary” numbers, which have to do with the square roots of negative numbers. That 
concept will be discussed later, in Chap. 16. 


2.3 The Concept of Sets 














We have already employed the word set several times. Inasmuch as the concept of sets
underlies every branch of modern mathematics, it is desirable to familiarize ourselves at
least with its more basic aspects.




Set Notation 
A set is simply a collection of distinct objects. These objects may be a group of (distinct) 
numbers, persons, food items, or something else. Thus, all the students enrolled in a particular
economics course can be considered a set, just as the three integers 2, 3, and 4 can
form a set. The objects in a set are called the elements of the set.

There are two alternative ways of writing a set: by enumeration and by description. If 
we let S represent the set of three numbers 2, 3, and 4, we can write, by enumeration of the
elements, 


$$S = \{2, 3, 4\}$$


But if we let I denote the set of all positive integers, enumeration becomes difficult, and we
may instead simply describe the elements and write


$$I = \{x \mid x \text{ a positive integer}\}$$


which is read as follows: “I is the set of all (numbers) x, such that x is a positive integer.”
Note that a pair of braces is used to enclose the set in either case. In the descriptive
approach, a vertical bar (or a colon) is always inserted to separate the general designating
symbol for the elements from the description of the elements. As another example, the
set of all real numbers greater than 2 but less than 5 (call it J) can be expressed symbolically
as


$$J = \{x \mid 2 < x < 5\}$$


Here, even the descriptive statement is symbolically expressed. 
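
The two ways of writing a set have close counterparts in Python, which we use here only as an illustration. In this sketch, enumeration is a set literal and description is a condition that members must satisfy. Since a computer cannot store an infinite set, the description of the positive integers is applied to an arbitrary finite pool of candidates, and the set of reals between 2 and 5 is represented by its membership rule alone; the names `candidates`, `I_sample`, and `in_J` are hypothetical.

```python
# Enumeration: list the elements explicitly.
S = {2, 3, 4}

# Description: keep every candidate x that satisfies the stated condition.
# The pool -5, ..., 20 is an arbitrary finite stand-in for "all numbers".
candidates = range(-5, 21)
I_sample = {x for x in candidates if x > 0}


def in_J(x):
    """Membership rule for J = {x | 2 < x < 5}."""
    return 2 < x < 5


print(S == {4, 3, 2})             # True: order of listing is irrelevant
print(sorted(I_sample)[:5])       # [1, 2, 3, 4, 5]
print(in_J(3.7), in_J(5))         # True False
```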

A set with a finite number of elements, exemplified by the previously given set S, is
called a finite set. Set I and set J, each with an infinite number of elements, are, on the other
hand, examples of an infinite set. Finite sets are always denumerable (or countable), i.e.,
their elements can be counted one by one in the sequence 1, 2, 3, .... Infinite sets may,
however, be either denumerable (set I), or nondenumerable (set J). In the latter case, there
is no way to associate the elements of the set with the natural counting numbers 1, 2, 3, ...,
and thus the set is not countable.

Membership in a set is indicated by the symbol $\in$ (a variant of the Greek letter epsilon $\epsilon$
for “element”), which is read as follows: “is an element of.” Thus, for the two sets S and I
defined previously, we may write


$$2 \in S \qquad 3 \in S \qquad 8 \in I \qquad 9 \in I \qquad \text{(etc.)}$$


but obviously $8 \notin S$ (read: “8 is not an element of set S”). If we use the symbol R to denote
the set of all real numbers, then the statement “x is some real number” can be simply 
expressed by 


$$x \in R$$
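
Membership statements translate directly into Python's `in` and `not in` operators. The sketch below assumes the finite set S of the three integers 2, 3, and 4 and represents the infinite set I of positive integers by a membership rule; `in_I` is a hypothetical helper name.

```python
S = {2, 3, 4}


def in_I(x):
    """Membership rule for I, the set of all positive integers."""
    return isinstance(x, int) and x > 0


two_in_S = 2 in S            # 2 is an element of S
eight_in_I = in_I(8)         # 8 is an element of I
eight_not_in_S = 8 not in S  # 8 is not an element of S

print(two_in_S, eight_in_I, eight_not_in_S)   # True True True
```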


Relationships between Sets 


When two sets are compared with each other, several possible kinds of relationship may be 
observed. If two sets $S_1$ and $S_2$ happen to contain identical elements,

$$S_1 = \{2, 7, a, f\} \quad \text{and} \quad S_2 = \{2, a, 7, f\}$$




then $S_1$ and $S_2$ are said to be equal ($S_1 = S_2$). Note that the order of appearance of the
elements in a set is immaterial. Whenever we find even one element to be different in any two
sets, however, those two sets are not equal.

Another kind of set relationship is that one set may be a subset of another set. If we have
two sets

$$S = \{1, 3, 5, 7, 9\} \quad \text{and} \quad T = \{3, 7\}$$

then T is a subset of S, because every element of T is also an element of S. A more formal
statement of this is: T is a subset of S if and only if $x \in T$ implies $x \in S$. Using the set
inclusion symbols $\subset$ (is contained in) and $\supset$ (includes), we may then write

$$T \subset S \quad \text{or} \quad S \supset T$$


It is possible that two given sets happen to be subsets of each other. When this occurs,
however, we can be sure that these two sets are equal. To state this formally: we can have
$S_1 \subset S_2$ and $S_2 \subset S_1$ if and only if $S_1 = S_2$.

Note that, whereas the $\in$ symbol relates an individual element to a set, the $\subset$ symbol
relates a subset to a set. As an application of this idea, we may state on the basis of Fig. 2.1
that the set of all integers is a subset of the set of all rational numbers. Similarly, the set of
all rational numbers is a subset of the set of all real numbers.
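
As a small illustration (not part of the original text), the definition of a subset can be written out in Python and compared with the language's built-in set comparison. The helper name `is_subset` is our own; the sets are the five odd integers S and the two-element set T used above.

```python
S = {1, 3, 5, 7, 9}
T = {3, 7}


def is_subset(small, big):
    """small is a subset of big if and only if x in small implies x in big."""
    return all(x in big for x in small)


t_in_s = is_subset(T, S)      # every element of T is also in S
s_in_t = is_subset(S, T)      # fails: 1, 5, and 9 are not in T

# Two sets that are subsets of each other must be equal,
# and the order in which elements are listed is immaterial.
mutual = is_subset({3, 7}, {7, 3}) and is_subset({7, 3}, {3, 7})
same = {3, 7} == {7, 3}

print(t_in_s, s_in_t, mutual, same)   # True False True True
print(T <= S)                         # True: Python's built-in subset test
```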

How many subsets can be formed from the five elements in the set $S = \{1, 3, 5, 7, 9\}$?
First of all, each individual element of S can count as a distinct subset of S, such as $\{1\}$ and
$\{3\}$. But so can any pair, triple, or quadruple of these elements, such as $\{1, 3\}$, $\{1, 5\}$, and
$\{3, 7, 9\}$. Any subset that does not contain all the elements of S is called a proper subset of
S. But the set S itself (with all its five elements) can also be considered as one of its own
subsets: every element of S is an element of S, and thus the set S itself fulfills the definition
of a subset. This is, of course, a limiting case, that from which we get the largest possible
subset of S, namely, S itself.

At the other extreme, the smallest possible subset of S is a set that contains no element
at all. Such a set is called the null set, or empty set, denoted by the symbol $\emptyset$ or { }. The
reason for considering the null set as a subset of S is quite interesting: If the null set is not a
subset of S ($\emptyset \not\subset S$), then $\emptyset$ must contain at least one element x such that $x \notin S$. But since
by definition the null set has no element whatsoever, we cannot say that $\emptyset \not\subset S$; hence the
null set is a subset of S.

It is extremely important to distinguish the symbol $\emptyset$ or { } clearly from the notation
{0}; the former is devoid of elements, but the latter does contain an element, zero. The null
set is unique; there is only one such set in the whole world, and it is considered a subset of
any set that can be conceived. 

Counting all the subsets of S, including the two limiting cases S and $\emptyset$, we find a total
of $2^5 = 32$ subsets. In general, if a set has n elements, a total of $2^n$ subsets can be formed
from those elements.†











† Given a set with n elements $\{a, b, c, \ldots, n\}$ we may first classify its subsets into two categories: one
with the element a in it, and one without. Each of these two can be further classified into two
subcategories: one with the element b in it, and one without. Note that by considering the second
element b, we double the number of categories in the classification from 2 to 4 ($= 2^2$). By the same
token, the consideration of the element c will increase the total number of categories to 8 ($= 2^3$).
When all n elements are considered, the total number of categories will become the total number of
subsets, and that number is $2^n$.
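
The count of subsets can be confirmed by brute-force enumeration. The sketch below is our own illustration using Python's standard `itertools.combinations`; the function name `all_subsets` is hypothetical.

```python
from itertools import combinations


def all_subsets(elements):
    """Every subset of `elements`, from the null set up to the full set."""
    items = sorted(elements)
    return [set(combo)
            for size in range(len(items) + 1)
            for combo in combinations(items, size)]


S = {1, 3, 5, 7, 9}
subsets = all_subsets(S)

# Proper subsets exclude the limiting case S itself.
proper = [sub for sub in subsets if sub != S]

print(len(subsets), 2 ** len(S))      # 32 32
print(len(proper))                    # 31
print(set() in subsets, S in subsets) # True True
```

Enumerating by size (0 elements, 1 element, and so on up to all 5) is a different route from the doubling argument of the footnote, but it necessarily arrives at the same total.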


As a third possible type of set relationship, two sets may have no elements in common
at all. In that case, the two sets are said to be disjoint. For example, the set of all positive
integers and the set of all negative integers are mutually exclusive; thus they are disjoint sets.

A fourth type of relationship occurs when two sets have some elements in common but
some elements peculiar to each. In that event, the two sets are neither equal nor disjoint;
also, neither set is a subset of the other.


Operations on Sets
When we add, subtract, multiply, divide, or take the square root of some numbers, we are
performing mathematical operations. Although sets are different from numbers, one can
similarly perform certain mathematical operations on them. Three principal operations to
be discussed here involve the union, intersection, and complement of sets.

To take the union of two sets A and B means to form a new set containing those elements
(and only those elements) belonging to A, or to B, or to both A and B. The union set is
symbolized by $A \cup B$ (read: “A union B”).


Example 1

If $A = \{3, 5, 7\}$ and $B = \{2, 3, 4, 8\}$, then

$$A \cup B = \{2, 3, 4, 5, 7, 8\}$$

This example, incidentally, illustrates the case in which two sets A and B are neither equal
nor disjoint and in which neither is a subset of the other.


Example 2

Again referring to Fig. 2.1, we see that the union of the set of all integers and the set of all
fractions is the set of all rational numbers. Similarly, the union of the rational-number set
and the irrational-number set yields the set of all real numbers.


The intersection of two sets A and B, on the other hand, is a new set which contains those
elements (and only those elements) belonging to both A and B. The intersection set is
symbolized by $A \cap B$ (read: “A intersection B”).


Example 3

From the sets A and B in Example 1, we can write

$$A \cap B = \{3\}$$


Example 4

If $A = \{-3, 6, 10\}$ and $B = \{9, 2, 7, 4\}$, then $A \cap B = \emptyset$. Set A and set B are disjoint;
therefore their intersection is the empty set: no element is common to A and B.


It is obvious that intersection is a more restrictive concept than union. In the former, 
only the elements common to A and B are acceptable, whereas in the latter, membership in
either A or B is sufficient to establish membership in the union set. The operator symbols
$\cap$ and $\cup$ (which, incidentally, have the same kind of general status as the symbols $\sqrt{\ }$, $+$,
$\div$, etc.) therefore have the connotations “and” and “or,” respectively. This point can be
better appreciated by comparing the following formal definitions of intersection and union: 


Intersection:  $A \cap B = \{x \mid x \in A \text{ and } x \in B\}$


Union:  $A \cup B = \{x \mid x \in A \text{ or } x \in B\}$
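
The “and”/“or” reading of the two formal definitions can be mirrored almost word for word in Python. This sketch is an illustration only: the sets are those of the numerical examples above, the candidate pool `range(-5, 12)` is an arbitrary finite stand-in for “all x”, and the variable names are our own.

```python
A = {3, 5, 7}
B = {2, 3, 4, 8}

# Built-in operators: | is union, & is intersection.
union = A | B
intersection = A & B

# The formal definitions, applied to a finite pool of candidates x.
pool = range(-5, 12)
union_by_rule = {x for x in pool if x in A or x in B}
intersection_by_rule = {x for x in pool if x in A and x in B}

# Two disjoint sets: their intersection is the empty set.
C = {-3, 6, 10}
D = {9, 2, 7, 4}

print(sorted(union))                  # [2, 3, 4, 5, 7, 8]
print(intersection)                   # {3}
print(C & D == set(), C.isdisjoint(D))  # True True
```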


FIGURE 2.2 [Venn diagrams: (a) Union, $A \cup B$; (b) Intersection, $A \cap B$; (c) Complement.]


What about the complement of a set? To explain this, let us first introduce the concept of
the universal set. In a particular context of discussion, if the only numbers used are the set
of the first seven positive integers, we may refer to it as the universal set U. Then, with a
given set, say, $A = \{3, 6, 7\}$, we can define another set $\tilde{A}$ (read: “the complement of A”) as
the set that contains all the numbers in the universal set U that are not in the set A. That is,

$$\tilde{A} = \{x \mid x \in U \text{ and } x \notin A\} = \{1, 2, 4, 5\}$$

Note that, whereas the symbol $\cup$ has the connotation “or” and the symbol $\cap$ means “and,”
the complement symbol ~ carries the implication of “not.”


Example 5

If $U = \{5, 6, 7, 8, 9\}$ and $A = \{5, 6\}$, then $\tilde{A} = \{7, 8, 9\}$.


Example 6

What is the complement of U? Since every object (number) under consideration is included
in the universal set, the complement of U must be empty. Thus $\tilde{U} = \emptyset$.
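
Since a complement is always taken relative to a universal set, a computational version needs the universal set as an explicit argument. The following Python sketch is our own illustration; the function name `complement` is hypothetical.

```python
def complement(subset, universe):
    """Elements of the universal set that are not in `subset`."""
    return {x for x in universe if x not in subset}


# Universal set: the first seven positive integers.
U = set(range(1, 8))
A = {3, 6, 7}
A_tilde = complement(A, U)

# A second universal set, with the subset {5, 6}.
U2 = {5, 6, 7, 8, 9}
A2_tilde = complement({5, 6}, U2)

# The complement of the universal set itself is empty.
U_tilde = complement(U, U)

print(sorted(A_tilde))     # [1, 2, 4, 5]
print(sorted(A2_tilde))    # [7, 8, 9]
print(U_tilde == set())    # True
```

Python's set difference `U - A` gives the same result as `complement(A, U)` whenever A is a subset of U.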


The three types of set operation can be visualized in the three diagrams of Fig. 2.2,
known as Venn diagrams. In diagram a, the points in the upper circle form a set A, and the
points in the lower circle form a set B. The union of A and B then consists of the shaded area
covering both circles. In diagram b are shown the same two sets (circles). Since their
intersection should comprise only the points common to both sets, only the (shaded) overlapping
portion of the two circles satisfies the definition. In diagram c, let the points in the
rectangle be the universal set and let A be the set of points in the circle; then the complement
set $\tilde{A}$ will be the (shaded) area outside the circle.


Laws of Set Operations 


From Fig. 2.2, it may be noted that the shaded area in diagram a represents not only
$A \cup B$ but also $B \cup A$. Analogously, in diagram b the small shaded area is the visual
representation not only of $A \cap B$ but also of $B \cap A$. When formalized, this result is known
as the commutative law (of unions and intersections):

$$A \cup B = B \cup A \qquad A \cap B = B \cap A$$

These relations are very similar to the algebraic laws $a + b = b + a$ and $a \times b = b \times a$.

To take the union of three sets A, B, and C, we first take the union of any two sets and
then “union” the resulting set with the third; a similar procedure is applicable to the
intersection operation. The results of such operations are illustrated in Fig. 2.3. It is interesting
that the order in which the sets are selected for the operation is immaterial. This fact gives
rise to the associative law (of unions and intersections):

$$A \cup (B \cup C) = (A \cup B) \cup C$$
$$A \cap (B \cap C) = (A \cap B) \cap C$$

These equations are strongly reminiscent of the algebraic laws $a + (b + c) = (a + b) + c$
and $a \times (b \times c) = (a \times b) \times c$.

FIGURE 2.3 [Venn diagrams: (a) $A \cup B \cup C$; (b) $A \cap B \cap C$.]

There is also a law of operation that applies when unions and intersections are used in
combination. This is the distributive law (of unions and intersections):

$$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$$
$$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$$

These resemble the algebraic law $a \times (b + c) = (a \times b) + (a \times c)$.


Example 7

Verify the distributive law, given $A = \{4, 5\}$, $B = \{3, 6, 7\}$, and $C = \{2, 3\}$. To verify the first
part of the law, we find the left- and right-hand expressions separately:

Left:  $A \cup (B \cap C) = \{4, 5\} \cup \{3\} = \{3, 4, 5\}$
Right:  $(A \cup B) \cap (A \cup C) = \{3, 4, 5, 6, 7\} \cap \{2, 3, 4, 5\} = \{3, 4, 5\}$

Since the two sides yield the same result, the law is verified. Repeating the procedure for the
second part of the law, we have

Left:  $A \cap (B \cup C) = \{4, 5\} \cap \{2, 3, 6, 7\} = \emptyset$
Right:  $(A \cap B) \cup (A \cap C) = \emptyset \cup \emptyset = \emptyset$

Thus the law is again verified.


To verify a law means to check by a specific example whether the law actually works 
out. If the law is valid, then any specific example ought indeed to work out. This implies
that if the law does not check out in as many as one single example, then the law is
invalidated. On the other hand, the successful verification by specific examples (however many)
does not in itself prove the law. To prove a law, it is necessary to demonstrate that the law is
valid for all possible cases. The procedure involved in such a demonstration will be
illustrated later (see, e.g., Sec. 2.5).
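
The difference between verifying and proving can be made concrete with a short Python sketch. It checks the distributive law for the sets A = {4, 5}, B = {3, 6, 7}, and C = {2, 3} of Example 7 and for 200 randomly generated triples, and then shows how one counterexample invalidates a false “law.” The function names, the random seed, and the bogus law (that union always equals intersection) are all our own assumptions for illustration.

```python
import random


def distributive_holds(A, B, C):
    """Check both parts of the distributive law for one triple of sets."""
    first = (A | (B & C)) == ((A | B) & (A | C))
    second = (A & (B | C)) == ((A & B) | (A & C))
    return first and second


def bogus_law_holds(A, B):
    """A false 'law' used for contrast: union equals intersection."""
    return (A | B) == (A & B)


def random_subset(rng, universe):
    """Pick each element of `universe` with probability one-half."""
    return {x for x in universe if rng.random() < 0.5}


example_7 = distributive_holds({4, 5}, {3, 6, 7}, {2, 3})

rng = random.Random(0)   # fixed seed so the run is reproducible
trials = [tuple(random_subset(rng, range(10)) for _ in range(3))
          for _ in range(200)]
all_pass = all(distributive_holds(A, B, C) for A, B, C in trials)

print(example_7, all_pass)           # True True
print(bogus_law_holds({1}, {1}))     # True: one verification...
print(bogus_law_holds({1}, {2}))     # False: ...but one counterexample invalidates
```

Two hundred successful checks still amount only to verification; they raise confidence in the distributive law but do not replace a demonstration that covers all possible cases.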








EXERCISE 2.3 


1. Write the following in set notation: 
(@) The set of all real numbers greater than 34. 
(b) The set of all real nurnbers greater than & but less than 65. 

2. Given the sets 5; = {2, 4, 6}, So = {7, 2, 6}, Ss = (4, 2, 6}, ard Sy = {2, 4}, which of the 
following statements are true? 


@S5=8 (d) 3 ¢ Se (9) 53 S4 
(b) 5; = R (set of real numbers) (e) 4 ¢ Ss YMC SK 
(8eh () SCR () &2D{1,2} 
3. Referring to the four sets given in Prob. 2, find:
(QSHUS (oO 2rs 2 Gnas 
(b) SU Sy (d) 2° Sq F) 3SUSUS 
4. Which of the following statements are valid?
(a) A ∪ A = A    (d) A ∪ U = U    (g) The complement of Ã is A.
(b) A ∩ A = A    (e) A ∩ ∅ = ∅
(c) A ∪ ∅ = A    (f) A ∩ U = A


5. Given A = {4, 5, 6}, B = {3, 4, 6, 7}, and C = {2, 3, 6}, verify the distributive law.

6. Verify the distributive law by means of Venn diagrams, with different orders of succes- 
sive shading. 

7. Enumerate all the subsets of the set {5, 6, 7}. 

8. Enumerate all the subsets of the set S = {a, b, c, d}. How many subsets are there
altogether?

9. Example 6 shows that ∅ is the complement of U. But since the null set is a subset of
any set, ∅ must be a subset of U. Inasmuch as the term "complement of U" implies the
notion of being not in U, whereas the term "subset of U" implies the notion of being in
U, it seems paradoxical for ∅ to be both of these. How do you resolve this paradox?




2.4 Relations and Functions 





Our discussion of sets was prompted by the usage of that term in connection with the various
kinds of numbers in our number system. However, sets can refer as well to objects other
than numbers. In particular, we can speak of sets of "ordered pairs" (to be defined
presently), which will lead us to the important concepts of relations and functions.


Ordered Pairs

In writing a set {a, b}, we do not care about the order in which the elements a and b appear,
because by definition {a, b} = {b, a}. The pair of elements a and b is in this case an
unordered pair. When the ordering of a and b does carry a significance, however, we can write
two different ordered pairs denoted by (a, b) and (b, a), which have the property that
(a, b) ≠ (b, a) unless a = b. Similar concepts apply to a set with more than two elements,
in which case we can distinguish between ordered and unordered triples, quadruples,
quintuples, and so forth. Ordered pairs, triples, etc., collectively can be called ordered sets; they
are enclosed with parentheses rather than braces.


Example 1

To show the age and weight of each student in a class, we can form ordered pairs (a, w), in
which the first element indicates the age (in years) and the second element indicates the
weight (in pounds). Then (19, 127) and (127, 19) would obviously mean different things.
Moreover, the latter ordered pair would hardly fit any student anywhere.


Example 2

When we speak of the set of all contestants in an Olympic game, the order in which they
are listed is of no consequence and we have an unordered set. But the set {gold-medalist,
silver-medalist, bronze-medalist} is an ordered triple.


Ordered pairs, like other objects, can be elements of a set. Consider the rectangular
(Cartesian) coordinate plane in Fig. 2.4, where an x axis and a y axis cross each other at a
right angle, dividing the plane into four quadrants. This xy plane is an infinite set of points,
each of which represents an ordered pair whose first element is an x value and the second
element a y value. Clearly, the point labeled (4, 2) is different from the point (2, 4); thus
ordering is significant here.


FIGURE 2.4
[Figure: the rectangular coordinate plane, with the x and y axes each marked from −4 to 4 and
the four quadrants labeled I to IV. The points (2, 4), (4, 4), (2, 2), and (4, 2) are labeled in
Quadrant I.]
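
The contrast between unordered and ordered pairs has a close analogue in Python, which we sketch here under the assumption that a built-in `set` models an unordered collection and a `tuple` models an ordered one. The variable names are illustrative only.

```python
# Unordered pair: a set ignores the order in which its elements are written.
unordered_equal = {19, 127} == {127, 19}

# Ordered pair: a tuple distinguishes the first element from the second.
ordered_equal = (19, 127) == (127, 19)

# A point in the xy plane is an ordered pair (x, y).
point_a = (4, 2)
point_b = (2, 4)
same_point = point_a == point_b

# Ordered pairs can themselves be elements of a set.
points = {(2, 4), (4, 4), (2, 2), (4, 2)}

print(unordered_equal, ordered_equal, same_point, point_a in points)
```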







With this visual understanding, we are ready to consider the process of generation of
ordered pairs. Suppose, from two given sets, x = {1, 2} and y = {3, 4}, we wish to form all
the possible ordered pairs with the first element taken from set x and the second element taken
from set y. The result will, of course, be the set of four ordered pairs (1, 3), (1, 4), (2, 3), and
(2, 4). This set is called the Cartesian product (named after Descartes), or direct product, of
the sets x and y and is denoted by x × y (read: "x cross y"). It is important to remember that,
while x and y are sets of numbers, the Cartesian product turns out to be a set of ordered pairs.
By enumeration, or by description, we may express this Cartesian product alternatively as


$$x \times y = \{(1, 3), (1, 4), (2, 3), (2, 4)\}$$
or
$$x \times y = \{(a, b) \mid a \in x \text{ and } b \in y\}$$


The latter expression may in fact be taken as the general definition of Cartesian product for
any given sets x and y.

To broaden our horizon, now let both x and y include all the real numbers. Then the
resulting Cartesian product


$$x \times y = \{(a, b) \mid a \in R \text{ and } b \in R\} \qquad (2.3)$$


will represent the set of all ordered pairs with real-valued elements. Besides, each ordered
pair corresponds to a unique point in the Cartesian coordinate plane of Fig. 2.4, and,
conversely, each point in the coordinate plane also corresponds to a unique ordered pair in the
set x × y. In view of this double uniqueness, a one-to-one correspondence is said to exist
between the set of ordered pairs in the Cartesian product (2.3) and the set of points in the
rectangular coordinate plane. The rationale for the notation x × y is now easy to perceive;
we may associate it with the crossing of the x axis and the y axis in Fig. 2.4. A simpler way
of expressing the set x × y in (2.3) is to write it directly as R × R; this is also commonly
denoted by R².

Extending this idea, we may also define the Cartesian product of three sets x, y, and z as
follows:


$$x \times y \times z = \{(a, b, c) \mid a \in x,\ b \in y,\ c \in z\}$$


which is a set of ordered triples. Furthermore, if the sets x, y, and z each consist of all
the real numbers, the Cartesian product will correspond to the set of all points in a
three-dimensional space. This may be denoted by R × R × R, or more simply, R³. In the present
discussion, all the variables are taken to be real-valued; thus the framework will generally
be R², or R³, ..., or Rⁿ.
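
For finite sets, the Cartesian product can be generated mechanically. The following Python sketch builds the product of x = {1, 2} and y = {3, 4} in two ways: by a comprehension that mirrors the descriptive definition, and by the standard-library function `itertools.product`. The third set `z = {5, 6}` is a hypothetical choice made only to show a product of three sets.

```python
from itertools import product

x = {1, 2}
y = {3, 4}

# By description: all (a, b) with a taken from x and b taken from y.
x_cross_y = {(a, b) for a in x for b in y}

# The same set, built with itertools.product.
x_cross_y_alt = set(product(x, y))

# Reversing the order of the sets gives different ordered pairs.
y_cross_x = set(product(y, x))

# A product of three sets is a set of ordered triples.
z = {5, 6}  # hypothetical third set
x_cross_y_cross_z = set(product(x, y, z))

print(sorted(x_cross_y))
print(len(x_cross_y_cross_z))
```

Note that the number of ordered pairs is the product of the numbers of elements in the two sets, which is one way to remember the name.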


Relations and Functions 

Since any ordered pair associates a y value with an x value, any collection of ordered
pairs (that is, any subset of the Cartesian product (2.3)) will constitute a relation between y and
x. Given an x value, one or more y values will be specified by that relation. For convenience,
we shall now write the elements of x × y generally as (x, y) rather than as (a, b), as was
done in (2.3), where both x and y are variables.


Example 3

The set {(x, y) | y = 2x} is a set of ordered pairs including, for example, (1, 2), (0, 0), and
(−1, −2). It constitutes a relation, and its graphical counterpart is the set of points lying on
the straight line y = 2x, as seen in Fig. 2.5.


Example 4

The set {(x, y) | y ≤ x}, which consists of such ordered pairs as (1, 0), (1, 1), and (1, −4),
constitutes another relation. In Fig. 2.5, this set corresponds to the set of all points in the
shaded area which satisfy the inequality y ≤ x.


FIGURE 2.5


Observe that, when the x value is given, it may not always be possible to determine a
unique y value from a relation. In Example 4, the three exemplary ordered pairs show that
if x = 1, y can take various values, such as 0, 1, or −4, and yet in each case satisfy the
stated relation. Graphically, two or more points of a relation may fall on a single vertical
line in the xy plane. This is exemplified in Fig. 2.5, where many points in the shaded area
(representing the relation y ≤ x) fall on the broken vertical line labeled x = a.

As a special case, however, a relation may be such that for each x value there exists only
one corresponding y value. The relation in Example 3 is a case in point. In such a case, y is
said to be a function of x, and this is denoted by y = f(x), which is read as "y equals f of
x." [Note: f(x) does not mean f times x.] A function is therefore a set of ordered pairs with
the property that any x value uniquely determines a y value.† It should be clear that a
function must be a relation, but a relation may not be a function.

Although the definition of a function stipulates a unique y for each x, the converse is not
required. In other words, more than one x value may legitimately be associated with the
same y value. This possibility is illustrated in Fig. 2.6, where the values x₁ and x₂ in the x
set are both associated with the same value (y₀) in the y set by the function y = f(x).
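
The test that separates a function from a mere relation can be sketched in Python. We assume here that a relation is stored as a finite set of (x, y) tuples and that the relations y = 2x and y ≤ x are sampled on a small grid of integers; the grid and the helper name `is_function` are illustrative choices, not part of the text.

```python
def is_function(relation):
    """Return True if no x value is paired with more than one y value."""
    seen = {}
    for x, y in relation:
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True

grid = range(-4, 5)  # finite sample of integer values (an assumption)

# The relation y = 2x, restricted to the grid.
line = {(x, 2 * x) for x in grid}

# The relation y <= x, restricted to the grid.
half_plane = {(x, y) for x in grid for y in grid if y <= x}

# Several x values may share one y value, and the relation is still a function.
square = {(x, x * x) for x in grid}

print(is_function(line), is_function(half_plane), is_function(square))
```

The third relation shows that the converse is not required: both x = −2 and x = 2 are mapped into y = 4, yet each x still has a unique y.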

A function is also called a mapping, or transformation; both words connote the action of
associating one thing with another. In the statement y = f(x), the functional notation f
may thus be interpreted to mean a rule by which the set x is "mapped" ("transformed") into
the set y. Thus we may write

$$f: x \to y$$

where the arrow indicates mapping, and the letter f symbolically specifies a rule of mapping.
Since f represents a particular rule of mapping, a different functional notation must
be employed to denote another function that may appear in the same model. The customary
symbols (besides f) used for this purpose are g, F, G, the Greek letters φ (phi) and ψ (psi),
and their capitals, Φ and Ψ. For instance, two variables y and z may both be functions of x,
but if one function is written as y = f(x), the other should be written as z = g(x), or
z = φ(x). It is also permissible, however, to write y = y(x) and z = z(x), thereby
dispensing with the symbols f and g altogether.


† This definition of function corresponds to what would be called a single-valued function in the older
terminology. What was formerly called a multivalued function is now referred to as a relation or
correspondence.


FIGURE 2.6
[Figure: the function y = f(x) associates both x₁ and x₂ in the x set with the same value y₀ in the y set.]

In the function y = f(x), x is referred to as the argument of the function, and y is called
the value of the function. We shall also alternatively refer to x as the independent variable
and y as the dependent variable. The set of all permissible values that x can take in a given
context is known as the domain of the function, which may be a subset of the set of all real
numbers. The y value into which an x value is mapped is called the image of that x value.
The set of all images is called the range of the function, which is the set of all values that
the y variable can take. Thus the domain pertains to the independent variable x, and the
range has to do with the dependent variable y.

As illustrated in Fig. 2.7a, we may regard the function f as a rule for mapping each point
on some line segment (the domain) into some point on another line segment (the range). By
placing the domain on the x axis and the range on the y axis, as in Fig. 2.7b, however, we
immediately obtain the familiar two-dimensional graph, in which the association between
x values and y values is specified by a set of ordered pairs such as (x₁, y₁) and (x₂, y₂).


FIGURE 2.7
[Figure, two panels: (a) points on one line segment (Domain) mapped into points on another
line segment (Range); (b) the same mapping drawn as a graph, with the domain on the x axis
and the range on the y axis.]


In economic models, behavioral equations usually enter as functions. Since most variables
in economic models are by their nature restricted to being nonnegative real numbers,*
their domains are also so restricted. This is why most geometric representations in
economics are drawn only in the first quadrant. In general, we shall not bother to specify
the domain of every function in every economic model. When no specification is given, it
is to be understood that the domain (and the range) will only include numbers for which a
function makes economic sense.


* We say "nonnegative" rather than "positive" when zero values are permissible.


Example 5

The total cost C of a firm per day is a function of its daily output Q: C = 150 + 7Q. The firm
has a capacity limit of 100 units of output per day. What are the domain and the range of
the cost function? Inasmuch as Q can vary only between 0 and 100, the domain is the set
of values 0 ≤ Q ≤ 100; or more formally,

$$\text{Domain} = \{Q \mid 0 \le Q \le 100\}$$

As for the range, since the function plots as a straight line, with the minimum C value at 150
(when Q = 0) and the maximum C value at 850 (when Q = 100), we have

$$\text{Range} = \{C \mid 150 \le C \le 850\}$$

Beware, however, that the extreme values of the range may not always occur where the
extreme values of the domain are attained.
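
The domain and range of the cost function C = 150 + 7Q can be checked numerically. The sketch below samples only the integer output levels from 0 to 100, which is a simplification, since Q may be any real number in that interval. The second function, `hill`, is a hypothetical example (not from the text) added to show a case in which the largest value of the range is attained in the interior of the domain rather than at an endpoint.

```python
def total_cost(q):
    """Cost function C = 150 + 7Q."""
    return 150 + 7 * q

domain = range(0, 101)  # integer outputs 0, 1, ..., 100 (a sampling assumption)
costs = [total_cost(q) for q in domain]
cost_range = (min(costs), max(costs))

def hill(q):
    """Hypothetical function whose maximum lies inside the domain."""
    return -(q - 50) ** 2 + 2500

hill_values = [hill(q) for q in domain]
hill_range = (min(hill_values), max(hill_values))
q_at_max = max(domain, key=hill)

print(cost_range, hill_range, q_at_max)
```

For the linear cost function the extremes of the range sit at the extremes of the domain; for `hill` the maximum occurs at Q = 50, while both endpoints of the domain give the minimum.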
EXERCISE 2.4 


1. Given S₁ = {3, 6, 9}, S₂ = {a, b}, and S₃ = {m, n}, find the Cartesian products:
(a) S₁ × S₂    (b) S₂ × S₃    (c) S₃ × S₁

2. From the information in Prob. 1, find the Cartesian product S₁ × S₂ × S₃.

3. In general, is it true that S₁ × S₂ = S₂ × S₁? Under what conditions will these two
Cartesian products be equal?

4. Does any of the following, drawn in a rectangular coordinate plane, represent a
function?
(a) A circle    (c) A rectangle
(b) A triangle    (d) A downward-sloping straight line

5. If the domain of the function y = 5 + 3x is the set {x | 1 < x < 9}, find the range of the
function and express it as a set.




6. For the function y = −x², if the domain is the set of all nonnegative real numbers, what
will its range be?

7. In the theory of the firm, economists consider the total cost C to be a function of the
output level Q: C = f(Q).

(a) According to the definition of a function, should each cost figure be associated with
a unique level of output?
(b) Should each level of output determine a unique cost figure?

8. If an output level Q₁ can be produced at a cost of C₁, then it must also be possible (by
being less efficient) to produce Q₁ at a cost of C₁ + $1, or C₁ + $2, and so on. Thus it
would seem that output Q does not uniquely determine total cost C. If so, to write
C = f(Q) would violate the definition of a function. How, in spite of this reasoning,
would you justify the use of the function C = f(Q)?


2.5 Types of Function 





The expression y = f(x) is a general statement to the effect that a mapping is possible, but
the actual rule of mapping is not thereby made explicit. Now let us consider several specific
types of function, each representing a different rule of mapping.


Constant Functions
A function whose range consists of only one element is called a constant function. As an
example, we cite the function


$$y = f(x) = 7$$


which is alternatively expressible as y = 7 or f(x) = 7, whose value stays the same
regardless of the value of x. In the coordinate plane, such a function will appear as a
horizontal straight line. In national-income models, when investment I is exogenously
determined, we may have an investment function of the form I = $100 million, or I = I₀, which
exemplifies the constant function.





Polynomial Functions 
The constant function is actually a "degenerate" case of what are known as polynomial
functions. The word polynomial means "multiterm," and a polynomial function of a single
variable x has the general form


$$y = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n \qquad (2.4)$$


in which each term contains a coefficient as well as a nonnegative-integer power of the
variable x. (As will be explained later in this section, we can write x¹ = x and x⁰ = 1 in
general; thus the first two terms may be taken to be a₀x⁰ and a₁x¹, respectively.) Note that,
instead of the symbols a, b, c, ..., we have employed the subscripted symbols a₀,
a₁, ..., aₙ for the coefficients. This is motivated by two considerations: (1) we can
economize on symbols, since only the letter a is "used up" in this way; and (2) the subscript helps
to pinpoint the location of a particular coefficient in the entire equation. For instance, in
(2.4), a₂ is the coefficient of x², and so forth.




Depending on the value of the integer n (which specifies the highest power of x), we
have several subclasses of polynomial function:


Case of n = 0:   $y = a_0$                                [constant function]
Case of n = 1:   $y = a_0 + a_1 x$                        [linear function]
Case of n = 2:   $y = a_0 + a_1 x + a_2 x^2$              [quadratic function]
Case of n = 3:   $y = a_0 + a_1 x + a_2 x^2 + a_3 x^3$    [cubic function]


and so forth. The superscript indicators of the powers of x are called exponents. The highest
power involved, i.e., the value of n, is often called the degree of the polynomial function;
a quadratic function, for instance, is a second-degree polynomial, and a cubic function
is a third-degree polynomial.† The order in which the several terms appear to the right of
the equals sign is inconsequential; they may be arranged in descending order of power
instead. Also, even though we have put the symbol y on the left, it is also acceptable to write
f(x) in its place.

When plotted in the coordinate plane, a linear function will appear as a straight line, as
illustrated in Fig. 2.8a. When x = 0, the linear function yields y = a₀; thus the ordered pair
(0, a₀) is on the line. This gives us the so-called y intercept (or vertical intercept), because
it is at this point that the vertical axis intersects the line. The other coefficient, a₁, measures
the slope (the steepness of incline) of our line. This means that a unit increase in x will
result in an increment in y in the amount of a₁. What Fig. 2.8a illustrates is the case of a₁ > 0,
involving a positive slope and thus an upward-sloping line; if a₁ < 0, the line will be
downward-sloping.

A quadratic function, on the other hand, plots as a parabola: roughly, a curve with a
single built-in bump or wiggle. The particular illustration in Fig. 2.8b implies a negative a₂;
in the case of a₂ > 0, the curve will "open" the other way, displaying a valley rather than a
hill. The graph of a cubic function will, in general, manifest two wiggles, as illustrated in
Fig. 2.8c. These functions will be used quite frequently in the economic models
subsequently discussed.
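
The roles of the coefficients in a polynomial function can be made concrete with a small Python sketch. We assume a polynomial is stored as a list of coefficients [a₀, a₁, ..., aₙ]; the function names `poly_eval` and `degree`, as well as the sample coefficients, are hypothetical and chosen only for illustration.

```python
def poly_eval(coeffs, x):
    """Evaluate a0 + a1*x + ... + an*x**n, with coeffs = [a0, a1, ..., an]."""
    result = 0
    for a in reversed(coeffs):  # Horner's scheme: work from the highest power down
        result = result * x + a
    return result

def degree(coeffs):
    """Highest power with a nonzero coefficient (0 for a constant function)."""
    for power in range(len(coeffs) - 1, 0, -1):
        if coeffs[power] != 0:
            return power
    return 0

linear = [3, 2]         # y = 3 + 2x        (hypothetical coefficients)
quadratic = [1, 4, -1]  # y = 1 + 4x - x**2 (a2 < 0, so the graph shows a hill)

y_intercept = poly_eval(linear, 0)                    # the value a0
slope = poly_eval(linear, 5) - poly_eval(linear, 4)   # effect of a unit increase in x

print(y_intercept, slope, degree(linear), degree(quadratic))
```

The slope computed from a unit increase in x equals a₁ regardless of the starting point, which is what distinguishes the linear case from the quadratic and cubic cases.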








Rational Functions 
A function such as

$$y = \frac{x - 1}{x^2 + 2x + 4}$$

in which y is expressed as a ratio of two polynomials in the variable x, is known as a
rational function. According to this definition, any polynomial function must itself be a rational
function, because it can always be expressed as a ratio to 1, and 1 is a constant function.

A special rational function that has interesting applications in economics is the function


$$y = \frac{a}{x} \qquad \text{or} \qquad xy = a$$


which plots as a rectangular hyperbola, as in Fig. 2.8d. Since the product of the two
variables is always a fixed constant in this case, this function may be used to represent that
special demand curve (with price P and quantity Q on the two axes) for which the total
expenditure PQ is constant at all levels of price. (Such a demand curve is the one with a
unitary elasticity at each point on the curve.) Another application is to the average fixed
cost (AFC) curve. With AFC on one axis and output Q on the other, the AFC curve must be
rectangular-hyperbolic because AFC × Q (= total fixed cost) is a fixed constant.


† In the several equations just cited, the last coefficient (aₙ) is always assumed to be nonzero;
otherwise the function would degenerate into a lower-degree polynomial.


FIGURE 2.8
[Figure, six panels: (a) Linear, y = a₀ + a₁x; (b) Quadratic, y = a₀ + a₁x + a₂x² (case of a₂ < 0);
(c) Cubic, y = a₀ + a₁x + a₂x² + a₃x³; (d) Rectangular-hyperbolic, y = a/x;
(e) Exponential, y = bˣ (b > 1); (f) Logarithmic, y = log_b x.]
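
The defining property of the rectangular hyperbola, a constant product of the two variables, can be checked with a brief Python sketch of the average-fixed-cost case. The total fixed cost of 120 and the list of output levels are hypothetical figures chosen for illustration; exact fractions are used so that the products compare without rounding error.

```python
from fractions import Fraction

total_fixed_cost = Fraction(120)  # hypothetical figure, not from the text

def average_fixed_cost(q):
    """AFC = TFC / Q, a rectangular hyperbola in the (Q, AFC) plane."""
    return total_fixed_cost / q

outputs = [1, 2, 5, 10, 40, 1000]
afc_values = [average_fixed_cost(q) for q in outputs]

# AFC x Q equals the same constant at every output level.
products = [afc * q for afc, q in zip(afc_values, outputs)]

# AFC keeps falling as Q grows but stays strictly positive.
strictly_falling = all(a > b for a, b in zip(afc_values, afc_values[1:]))
always_positive = all(afc > 0 for afc in afc_values)

print(products, strictly_falling, always_positive)
```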




The rectangular hyperbola drawn from xy = a never meets the axes, even if extended
indefinitely upward and to the right. Rather, the curve approaches the axes asymptotically:
as y becomes very large, the curve will come ever closer to the y axis but never actually
reach it, and similarly for the x axis. The axes constitute the asymptotes of this function.


Nonalgebraic Functions

Any function expressed in terms of polynomials and/or roots (such as square root) of
polynomials is an algebraic function. Accordingly, the functions discussed thus far are all
algebraic.

However, exponential functions such as y = bˣ, in which the independent variable
appears in the exponent, are nonalgebraic. The closely related logarithmic functions, such as
y = log_b x, are also nonalgebraic. These two types of function have a special role to play in
certain types of economic applications, and it is pedagogically desirable to postpone their
discussion to Chap. 10. Here, we simply preview their general graphic shapes in Fig. 2.8e
and f. Other types of nonalgebraic function are the trigonometric (or circular) functions,
which we shall discuss in Chap. 16 in connection with dynamic analysis. We should add
here that nonalgebraic functions are also known by the more esoteric name of
transcendental functions.


A Digression on Exponents 

In discussing polynomial functions, we introduced the term exponents as indicators of the
power to which a variable (or number) is to be raised. The expression 6² means that 6 is to
be raised to the second power; that is, 6 is to be multiplied by itself, or 6² = 6 × 6 = 36. In
general, we define, for a positive integer n,

$$x^n \equiv \underbrace{x \times x \times \cdots \times x}_{n \text{ terms}}$$

and as a special case, we note that x¹ = x. From the general definition, it follows that for
positive integers m and n, exponents obey the following rules:


Rule I   $x^m \times x^n = x^{m+n}$   (for example, $x^3 \times x^4 = x^7$)

PROOF
$$x^m \times x^n = (\underbrace{x \times x \times \cdots \times x}_{m \text{ terms}})(\underbrace{x \times x \times \cdots \times x}_{n \text{ terms}}) = \underbrace{x \times x \times \cdots \times x}_{m+n \text{ terms}} = x^{m+n}$$


Note that in this proof, we did not assign any specific value to the number x, or to the
exponents m and n. Thus the result obtained is generally true. It is for this reason that
the demonstration given constitutes a proof, as against a mere verification. The same can be
said about the proof of Rule II which follows.


Rule II   $\dfrac{x^m}{x^n} = x^{m-n}$   $(x \neq 0)$   (for example, $\dfrac{x^4}{x^3} = x$)

PROOF
$$\frac{x^m}{x^n} = \frac{\overbrace{x \times x \times \cdots \times x}^{m \text{ terms}}}{\underbrace{x \times x \times \cdots \times x}_{n \text{ terms}}} = \underbrace{x \times x \times \cdots \times x}_{m-n \text{ terms}} = x^{m-n}$$


because the n terms in the denominator cancel out n of the m terms in the numerator. Note
that the case of x = 0 is ruled out in the statement of this rule. This is because when x = 0,
the expression $x^m / x^n$ would involve division by zero, which is undefined.

What if m < n, say, m = 2 and n = 5? In that case we get, according to Rule II,
$x^{m-n} = x^{-3}$, a negative power of x. What does this mean? The answer is actually supplied
by Rule II itself: When m = 2 and n = 5, we have

$$\frac{x^2}{x^5} = \frac{x \times x}{x \times x \times x \times x \times x} = \frac{1}{x \times x \times x} = \frac{1}{x^3}$$

Thus $x^{-3} = 1/x^3$, and this may be generalized into another rule:


Rule III   $x^{-n} = \dfrac{1}{x^n}$   $(x \neq 0)$


To raise a (nonzero) number to a power of negative n is to take the reciprocal of its nth
power.

Another special case in the application of Rule II is when m = n, which yields the
expression $x^{m-n} = x^{m-m} = x^0$. To interpret the meaning of raising a number x to the zeroth
power, we can write out the term $x^{m-m}$ in accordance with Rule II, with the result that
$x^m / x^m = 1$. Thus we may conclude that any (nonzero) number raised to the zeroth power
is equal to 1. (The expression $0^0$ is undefined.) This may be expressed as another rule:


Rule IV   $x^0 = 1$   $(x \neq 0)$


As long as we are concerned only with polynomial functions, only (nonnegative) integer
powers are required. In exponential functions, however, the exponent is a variable that can
take noninteger values as well. In order to interpret a number such as $x^{1/2}$, let us consider
the fact that, by Rule I, we have

$$x^{1/2} \times x^{1/2} = x^1 = x$$

Since $x^{1/2}$ multiplied by itself is x, $x^{1/2}$ must be the square root of x. Similarly, $x^{1/3}$ can be
shown to be the cube root of x. In general, therefore, we can state the following rule:


Rule V   $x^{1/n} = \sqrt[n]{x}$


Two other rules obeyed by exponents are


Rule VI   $(x^m)^n = x^{mn}$
Rule VII   $x^m \times y^m = (xy)^m$
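
The rules of exponents can be spot-checked numerically. The Python sketch below takes Rule VI in the form (x^m)^n = x^(mn) and Rule VII in the form x^m × y^m = (xy)^m, and uses sample values of x, y, m, and n that are arbitrary choices of ours. In keeping with the distinction drawn earlier, such a check only verifies the rules for particular numbers; it does not prove them.

```python
import math

x, y = 2.0, 3.0   # sample nonzero bases (hypothetical)
m, n = 5, 3       # sample positive integer exponents (hypothetical)

checks = {
    "Rule I":   math.isclose(x**m * x**n, x**(m + n)),
    "Rule II":  math.isclose(x**m / x**n, x**(m - n)),
    "Rule III": math.isclose(x**(-n), 1 / x**n),
    "Rule IV":  x**0 == 1,
    "Rule V":   math.isclose((x**(1 / n))**n, x),   # x**(1/n) is the nth root of x
    "Rule VI":  math.isclose((x**m)**n, x**(m * n)),
    "Rule VII": math.isclose(x**m * y**m, (x * y)**m),
}

for rule, passed in checks.items():
    print(rule, passed)
```

`math.isclose` is used instead of exact equality because fractional powers are computed in floating point.
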
EXERCISE 2.5 
1. Graph the functions
(a) y = 16 + 2x    (b) y = 8 − 2x    (c) y = 2x + 12


(In each case, consider the domain as consisting of nonnegative real numbers only.)

2. What is the major difference between (a) and (b) in Prob. 1? How is this difference re- 
flected in the graphs? What is the major difference between (a) and (c)? How do their 
graphs reflect it? 




3. Graph the functions
(a) y = −x² + 5x − 2    (b) y = x² + 5x − 2
with the set of values −5 < x < 5 constituting the domain. It is well known that the
sign of the coefficient of the x² term determines whether the graph of a quadratic function
will have a "hill" or a "valley." On the basis of the present problem, which sign is
associated with the hill? Supply an intuitive explanation for this.

4. Graph the function y = 36/x, assuming that x and y can take positive values only. Next,
suppose that both variables can take negative values as well; how must the graph be
modified to reflect this change in assumption?

5. Condense the following expressions: 

(a) xt x x!8 (BY x7 x xP x xe Oexpxe 

6. Find: (a) 3/477 (8) (8? x PY 2B 

7. Show that $x^{m/n} = \sqrt[n]{x^m} = (\sqrt[n]{x})^m$. Specify the rules applied in each step.

8. Prove Rule VI and Rule VII.


2.6 Functions of Two or More Independent Variables 





Thus far, we have considered only functions of a single independent variable, y = f(x).
But the concept of a function can be readily extended to the case of two or more
independent variables. Given a function


$$z = g(x, y)$$


a given pair of x and y values will uniquely determine a value of the dependent variable z.
Such a function is exemplified by


$$z = ax + by \qquad \text{or} \qquad z = a_0 + a_1 x + a_2 x^2 + b_1 y + b_2 y^2$$


Just as the function y = f(x) maps a point in the domain into a point in the range, the
function g will do precisely the same. However, the domain is in this case no longer a set of
numbers but a set of ordered pairs (x, y), because we can determine z only when both x
and y are specified. The function g is thus a mapping from a point in a two-dimensional
space into a point on a line segment (i.e., a point in a one-dimensional space), such as from
the point (x₁, y₁) into the point z₁ or from (x₂, y₂) into z₂ in Fig. 2.9a.

If a vertical z axis is erected perpendicular to the xy plane, as is done in diagram b,
however, there will result a three-dimensional space in which the function g can be given a
graphical representation as follows. The domain of the function will be some subset of
the points in the xy plane, and the value of the function (value of z) for a given point in the
domain, say, (x₁, y₁), can be indicated by the height of a vertical line planted on that
point. The association between the three variables is thus summarized by the ordered triple
(x₁, y₁, z₁), which is a specific point in the three-dimensional space. The locus of such
ordered triples, which will take the form of a surface, then constitutes the graph of the
function g. Whereas the function y = f(x) is a set of ordered pairs, the function z = g(x, y)
will be a set of ordered triples. We shall have many occasions to use functions of this type
in economic models. One ready application is in the area of production functions. Suppose
that output is determined by the amounts of capital (K) and labor (L) employed; then
we can write a production function in the general form Q = Q(K, L).


FIGURE 2.9
[Figure, two panels: (a) points (x₁, y₁) and (x₂, y₂) in the xy plane mapped into points z₁ and z₂
on a line; (b) a three-dimensional diagram with a vertical z axis, in which the graph of the
function g appears as a surface.]
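
The idea that a function of two variables is a set of ordered triples can be sketched in Python. We assume the linear form z = ax + by with hypothetical coefficients a = 2 and b = 3, and a small domain consisting of integer ordered pairs; none of these numbers comes from the text.

```python
def g(x, y, a=2, b=3):
    """A linear function of two variables, z = a*x + b*y (hypothetical coefficients)."""
    return a * x + b * y

# Domain: a small set of ordered pairs (x, y).
domain = [(x, y) for x in range(3) for y in range(3)]

# The function as a set of ordered triples (x, y, z).
triples = {(x, y, g(x, y)) for x, y in domain}

# Each ordered pair in the domain determines exactly one z value.
z_values_per_pair = {}
for x, y, z in triples:
    z_values_per_pair.setdefault((x, y), set()).add(z)
single_valued = all(len(zs) == 1 for zs in z_values_per_pair.values())

print(len(triples), (1, 2, g(1, 2)) in triples, single_valued)
```

A production function Q = Q(K, L) has the same structure, with (K, L) in place of (x, y) and Q in place of z.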

The possibility of further extension to the cases of three or more independent variables
is self-evident. With the function y = h(u, v, w), for example, we can map a point in
the three-dimensional space, (u₁, v₁, w₁), into a point in a one-dimensional space (y₁).
Such a function might be used to indicate that a consumer's utility is a function of his or her
consumption of three different commodities, and the mapping is from a three-dimensional
commodity space into a one-dimensional utility space. But this time it will be physically
impossible to graph the function, because for that task a four-dimensional diagram is
needed to picture the ordered quadruples, but the world in which we live is only
three-dimensional. Nonetheless, in view of the intuitive appeal of geometric analogy, we can
continue to refer to an ordered quadruple (u₁, v₁, w₁, y₁) as a "point" in the four-dimensional
space. The locus of such points will give the (nongraphable) "graph" of the function
y = h(u, v, w), which is called a hypersurface. These terms, viz., point and hypersurface,
are also carried over to the general case of the n-dimensional space.

Functions of more than one variable can be classified into various types, too. For
instance, a function of the form

$$y = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$$

is a linear function, whose characteristic is that every variable is raised to the first power
only. A quadratic function, on the other hand, involves first and second powers of one or
more independent variables, but the sum of exponents of the variables appearing in any
single term must not exceed 2.

Note that instead of denoting the independent variables by x, u, v, w, etc., we have
switched to the symbols x₁, x₂, ..., xₙ. The latter notation, like the system of subscripted
coefficients, has the merit of economy of alphabet, as well as of an easier accounting of the
number of variables involved in a function.


2.7 Levels of Generality 





In discussing the various types of function, we have without explicit notice introduced
examples of functions that pertain to varying levels of generality. In certain instances, we
have written functions in the form


$$y = 7 \qquad y = 6x + 4 \qquad y = x^2 - 3x + 1 \qquad \text{(etc.)}$$


Not only are these expressed in terms of numerical coefficients, but they also indicate
specifically whether each function is constant, linear, or quadratic. In terms of graphs, each
such function will give rise to a well-defined unique curve. In view of the numerical nature
of these functions, the solutions of the model based on them will emerge as numerical
values also. The drawback is that, if we wish to know how our analytical conclusion will
change when a different set of numerical coefficients comes into effect, we must go through
the reasoning process afresh each time. Thus, the results obtained from specific functions
have very little generality.
On a more general level of discussion and analysis, there are functions in the form


$$y = a \qquad y = a + bx \qquad y = a + bx + cx^2 \qquad \text{(etc.)}$$


Since parameters are used, each function represents not a single curve but a whole family
of curves. The function y = a, for instance, encompasses not only the specific cases
y = 0, y = 1, and y = 2 but also y = 4, y = −5, ..., ad infinitum. With parametric
functions, the outcome of mathematical operations will also be in terms of parameters. These
results are more general in the sense that, by assigning various values to the parameters
appearing in the solution of the model, a whole family of specific answers may be obtained
without having to repeat the reasoning process anew.
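
The gain in generality from parametric functions can be illustrated with a short Python sketch. Here `make_linear` is a hypothetical helper that turns the parametric form y = a + bx into a specific function once values of a and b are assigned. As an example of a result stated in terms of parameters, we use the value of x at which a + bx = 0, namely x = −a/b (with b ≠ 0); this particular result is our own illustration and does not appear in the text.

```python
def make_linear(a, b):
    """Return the specific function y = a + b*x for given parameter values."""
    def f(x):
        return a + b * x
    return f

# One parametric form yields a whole family of specific functions.
family = {(a, b): make_linear(a, b) for a in (0, 1, 2) for b in (-1, 3)}

def root(a, b):
    """Parametric result derived once: a + b*x = 0 at x = -a/b, provided b != 0."""
    return -a / b

# Assigning parameter values gives specific answers without redoing the derivation.
roots = {params: root(*params) for params in family}

print(family[(2, 3)](1), roots[(2, -1)])
```

With numerical functions, each of the six cases would have required solving a separate equation; with the parametric result, one formula covers all of them.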

In order to attain an even higher level of generality, we may resort to the general function
statement y = f(x), or z = g(x, y). When expressed in this form, the function is not
restricted to being either linear, quadratic, exponential, or trigonometric, all of which are
subsumed under the notation. The analytical result based on such a general formulation
will therefore have the most general applicability. As will be found below, however, in order
to obtain economically meaningful results, it is often necessary to impose certain
qualitative restrictions on the general functions built into a model, such as the restriction that a
demand function have a negatively sloped graph or that a consumption function have a
graph with a positive slope of less than 1.

To sum up the present chapter, the structure of a mathematical economic model is
now clear. In general, it will consist of a system of equations, which may be definitional,
behavioral, or in the nature of equilibrium conditions.* The behavioral equations are
usually in the form of functions, which may be linear or nonlinear, numerical or parametric,
and with one independent variable or many. It is through these that the analytical
assumptions adopted in the model are given mathematical expression.

In attacking an analytical problem, therefore, the first step is to select the appropriate
variables (exogenous as well as endogenous) for inclusion in the model. Next, we must
translate into equations the set of chosen analytical assumptions regarding the human,
institutional, technological, legal, and other behavioral aspects of the environment affecting
the working of the variables. Only then can we attempt to derive a set of conclusions
through relevant mathematical operations and manipulations and to give them appropriate
economic interpretations.


* Inequalities may also enter as an important ingredient of a model, but we shall not worry about
them for the time being.


Part 2
Static (or Equilibrium)
Analysis


Chapter 3


Equilibrium Analysis 
in Economics 


The analytical procedure outlined in Chap. 2 will first be applied to what is known as static 
analysis, or equilibrium analysis. For this purpose, it is imperative first to have a clear 
understanding of what equilibrium means. 


3.1 The Meaning of Equilibrium 







Like any economic term, equilibrium can be defined in various ways. According to one 
definition, an equilibrium is “a constellation of selected interrelated variables so adjusted 
to one another that no inherent tendency to change prevails in the model which they
constitute.”¹ Several words in this definition deserve special attention. First, the word selected
underscores the fact that there do exist variables which, by the analyst’s choice, have not 
been included in the model. Hence the equilibrium under discussion can have relevance 
only in the context of the particular set of variables chosen, and if the model is enlarged to 
include additional variables, the equilibrium state pertaining to the smaller model will no 
longer apply. 

Second, the word interrelated suggests that, in order for equilibrium to occur, all vari- 
ables in the model must simultaneously be in a state of rest. Moreover, the state of rest of 
each variable must be compatible with that of every other variable; otherwise some vari- 
able(s) will be changing, thereby also causing the others to change in a chain reaction, and 
no equilibrium can be said to exist. 

Third, the word inherent implies that, in defining an equilibrium, the state of rest
involved is based only on the balancing of the internal forces of the model, while the
external factors are assumed fixed. Operationally, this means that parameters and exogenous
variables are treated as constants. When the external factors do actually change, there will
be a new equilibrium defined on the basis of the new parameter values, but in defining the
new equilibrium, the new parameter values are again assumed to persist and stay
unchanged.


¹ Fritz Machlup, “Equilibrium and Disequilibrium: Misplaced Concreteness and Disguised Politics,”
Economic Journal, March 1958, p. 9. (Reprinted in F. Machlup, Essays on Economic Semantics,
Prentice Hall, Inc., Englewood Cliffs, N.J., 1963.)




In essence, an equilibrium for a specified model is a situation characterized by a lack of
tendency to change. It is for this reason that the analysis of equilibrium (more specifically, 
the study of what the equilibrium state is like) is referred to as statics. 

The fact that an equilibrium implies no tendency to change may tempt one to conclude 
that an equilibrium necessarily constitutes a desirable or ideal state of affairs, on the 
ground that only in the ideal state would there be a lack of motivation for change. Such a 
conclusion is unwarranted. Even though a certain equilibrium position may represent a 
desirable state and something to be striven for—such as a profit-maximizing situation, 
from the firm’s point of view—another equilibrium position may be quite undesirable and 
therefore something to be avoided, such as an underemployment equilibrium level of 
national income. The only warranted interpretation is that an equilibrium is a situation 
which, if attained, would tend to perpetuate itself, barring any changes in the external 
forces. 

The desirable variety of equilibrium, which we shall refer to as goal equilibrium, will be
treated later in Part 4 as optimization problems. In the present chapter, the discussion will
be confined to the nongoal type of equilibrium, resulting not from any conscious aiming at
a particular objective but from an impersonal or suprapersonal process of interaction and
adjustment of economic forces. Examples of this are the equilibrium attained by a market
under given demand and supply conditions and the equilibrium of national income under
given conditions of consumption and investment patterns.


3.2 Partial Market Equilibrium—A Linear Model





In a static-equilibrium model, the standard problem is that of finding the set of values of the 
endogenous variables which will satisfy the equilibrium condition of the model. This is 
because once we have identified those values, we have in effect identified the equilibrium 
state. Let us illustrate with a so-called partial-equilibrium market model, i.e., a model of 
price determination in an isolated market. 


Constructing the Model 

Since only one commodity is being considered, it is necessary to include only three
variables in the model: the quantity demanded of the commodity ($Q_d$), the quantity supplied
of the commodity ($Q_s$), and its price ($P$). The quantity is measured, say, in pounds per
week, and the price in dollars. Having chosen the variables, our next order of business is
to make certain assumptions regarding the working of the market. First, we must specify
an equilibrium condition, something indispensable in an equilibrium model. The standard
assumption is that equilibrium occurs in the market if and only if the excess demand
is zero ($Q_d - Q_s = 0$), that is, if and only if the market is cleared. But this immediately
raises the question of how $Q_d$ and $Q_s$ themselves are determined. To answer this, we
assume that $Q_d$ is a decreasing linear function of $P$ (as $P$ increases, $Q_d$ decreases). On
the other hand, $Q_s$ is postulated to be an increasing linear function of $P$ (as $P$ increases,
so does $Q_s$), with the proviso that no quantity is supplied unless the price exceeds a
particular positive level. In all, then, the model will contain one equilibrium condition
plus two behavioral equations which govern the demand and supply sides of the market,
respectively.







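FIGURE 3.1 [Graph with $Q_d$, $Q_s$ on the vertical axis and $P$ on the horizontal axis: the
demand line $Q_d = a - bP$ and the supply line $Q_s = -c + dP$ intersect at $(P^*, Q^*)$,
where $Q^* = Q_d^* = Q_s^*$.]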








Translated into mathematical statements, the model can be written as 


$$
\begin{aligned}
Q_d &= Q_s \\
Q_d &= a - bP \qquad (a, b > 0) \\
Q_s &= -c + dP \qquad (c, d > 0)
\end{aligned}
\tag{3.1}
$$


Four parameters, $a$, $b$, $c$, and $d$, appear in the two linear functions, and all of them are
specified to be positive. When the demand function is graphed, as in Fig. 3.1, its vertical
intercept is at $a$ and its slope is $-b$, which is negative, as required. The supply function also has
the required type of slope, $d$ being positive, but its vertical intercept is seen to be negative,
at $-c$. Why did we want to specify such a negative vertical intercept? The answer is that, in
so doing, we force the supply curve to have a positive horizontal intercept at $P_1$, thereby
satisfying the proviso stated earlier that supply will not be forthcoming unless the price is
positive and sufficiently high.

The reader should observe that, contrary to the usual practice, quantity rather than price
has been plotted vertically in Fig. 3.1. This, however, is in line with the mathematical
convention of placing the dependent variable on the vertical axis. In a different context in
which the demand curve is viewed from the standpoint of a business firm as describing the
average-revenue curve, $AR \equiv P = f(Q_d)$, we shall reverse the axes and plot $P$ vertically.

With the model thus constructed, the next step is to solve it, i.e., to obtain the solution
values of the three endogenous variables, $Q_d$, $Q_s$, and $P$. The solution values are those
values that satisfy the three equations in (3.1) simultaneously; i.e., they are the values
which, when substituted into the three equations, make the latter a set of true statements. In
the context of an equilibrium model, those values may also be referred to as the equilibrium
values of the said variables.

Many writers employ no special symbols to denote the solution values of the endogenous
variables. Thus, $Q_d$ is used to represent either the quantity-demanded variable (with a
whole range of values) or its solution value (a specific value); and similarly for the symbols
$Q_s$ and $P$. Unfortunately, this practice can give rise to possible confusions, especially in the
context of comparative-static analysis (e.g., Sec. 7.5). To avoid such a source of confusion,
we shall denote the solution value of an endogenous variable with an asterisk. Thus, the
solution values of $Q_d$, $Q_s$, and $P$ are denoted by $Q_d^*$, $Q_s^*$, and $P^*$, respectively. Since
$Q_d^* = Q_s^*$, however, they can even be replaced by a single symbol $Q^*$. Hence, an equilibrium
solution of the model may simply be denoted by an ordered pair $(P^*, Q^*)$. In case the
solution is not unique, several ordered pairs may each satisfy the system of simultaneous
equations; there will then be a solution set with more than one element in it. However, the
multiple-equilibrium situation cannot arise in a linear model such as the present one.


Solution by Elimination of Variables 
One way of finding a solution to an equation system is by successive elimination of
variables and equations through substitution. In (3.1), the model contains three equations in
three variables. However, in view of the equating of $Q_d$ and $Q_s$ by the equilibrium condition,
we can let $Q = Q_d = Q_s$ and rewrite the model equivalently as follows:

$$
\begin{aligned}
Q &= a - bP \\
Q &= -c + dP
\end{aligned}
\tag{3.2}
$$


thereby reducing the model to two equations in two variables. Moreover, by substituting the
first equation into the second in (3.2), the model can be further reduced to a single equation
in a single variable:

$$a - bP = -c + dP$$

or, after subtracting $(a + dP)$ from both sides of the equation and multiplying through
by $-1$,

$$(b + d)P = a + c \tag{3.3}$$


This result is also obtainable directly from (3.1) by substituting the second and third equa- 
tions into the first. 
Since $b + d \neq 0$, it is permissible to divide both sides of (3.3) by $(b + d)$. The result is
the solution value of $P$:

$$P^* = \frac{a + c}{b + d} \tag{3.4}$$





Note that $P^*$ is, as all solution values should be, expressed entirely in terms of the
parameters, which represent given data for the model. Thus $P^*$ is a determinate value, as
it ought to be. Also note that $P^*$ is positive, as a price should be, because all four
parameters are positive by model specification.

To find the equilibrium quantity $Q^*$ $(= Q_d^* = Q_s^*)$ that corresponds to the value $P^*$,
simply substitute (3.4) into either equation of (3.2), and then solve the resulting equation.
Substituting (3.4) into the demand function, for instance, we can get

$$Q^* = a - \frac{b(a + c)}{b + d} = \frac{a(b + d) - b(a + c)}{b + d} = \frac{ad - bc}{b + d} \tag{3.5}$$




which is again an expression in terms of parameters only. Since the denominator $(b + d)$ is
positive, the positivity of $Q^*$ requires that the numerator $(ad - bc)$ be positive as well.
Hence, to be economically meaningful, the present model should contain the additional
restriction that $ad > bc$.

The meaning of this restriction can be seen in Fig. 3.1. It is well known that the $P^*$ and
$Q^*$ of a market model may be determined graphically at the intersection of the demand and
supply curves. To have $Q^* > 0$ is to require the intersection point to be located above the
horizontal axis in Fig. 3.1, which in turn requires the slopes and vertical intercepts of the
two curves to fulfill a certain restriction on their relative magnitudes. That restriction,
according to (3.5), is $ad > bc$, given that both $b$ and $d$ are positive.
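
Formulas (3.4) and (3.5), together with the restriction $ad > bc$, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `solve_linear_market` and the parameter values $a = 10$, $b = 2$, $c = 2$, $d = 4$ are hypothetical and do not come from the text. Exact fractions are used so that no rounding enters the result.

```python
from fractions import Fraction

def solve_linear_market(a, b, c, d):
    """Return (P*, Q*) for Qd = a - bP, Qs = -c + dP, using (3.4) and (3.5)."""
    a, b, c, d = (Fraction(v) for v in (a, b, c, d))
    if min(a, b, c, d) <= 0:
        raise ValueError("the model specifies a, b, c, d > 0")
    p_star = (a + c) / (b + d)            # formula (3.4)
    q_star = (a * d - b * c) / (b + d)    # formula (3.5)
    if q_star <= 0:
        raise ValueError("ad > bc is required for a positive Q*")
    return p_star, q_star

# Hypothetical parameter values, chosen only for illustration.
p_star, q_star = solve_linear_market(10, 2, 2, 4)
print(p_star, q_star)  # 2 6
```

With these values, both the demand function ($10 - 2 \cdot 2$) and the supply function ($-2 + 4 \cdot 2$) give a quantity of 6 at $P = 2$, as an equilibrium requires.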

The intersection of the demand and supply curves in Fig. 3.1, incidentally, is in concept
no different from the intersection shown in the Venn diagram of Fig. 2.2. There is one
difference only: Instead of the points lying within two circles, the present case involves the
points that lie on two lines. Let the set of points on the demand and supply curves be
denoted, respectively, by $D$ and $S$. Then, by utilizing the symbol $Q$ $(= Q_d = Q_s)$, the two
sets and their intersection can be written

$$
\begin{aligned}
D &= \{(P, Q) \mid Q = a - bP\} \\
S &= \{(P, Q) \mid Q = -c + dP\} \\
\text{and} \quad D \cap S &= \{(P^*, Q^*)\}
\end{aligned}
$$


The intersection set contains in this instance only a single element, the ordered pair 
(P*, Q*). The market equilibrium is unique. 





EXERCISE 3.2
1. Given the market model
       $Q_d = Q_s$
       $Q_d = 21 - 3P$
       $Q_s = -4 + 8P$
   find $P^*$ and $Q^*$ by (a) elimination of variables and (b) using formulas (3.4) and (3.5).
   (Use fractions rather than decimals.)
2. Let the demand and supply functions be as follows:
   (a) $Q_d = 51 - 3P$, $Q_s = 6P - 10$
   (b) $Q_d = 30 - 2P$, $Q_s = -6 + 5P$
   Find $P^*$ and $Q^*$ by elimination of variables. (Use fractions rather than decimals.)

3. According to (3.5), for $Q^*$ to be positive, it is necessary that the expression $(ad - bc)$
   have the same algebraic sign as $(b + d)$. Verify that this condition is indeed satisfied in
   the models of Probs. 1 and 2.

4. If $(b + d) = 0$ in the linear market model, can an equilibrium solution be found by
   using (3.4) and (3.5)? Why or why not?

5. If $(b + d) = 0$ in the linear market model, what can you conclude regarding the
   positions of the demand and supply curves in Fig. 3.1? What can you conclude, then,
   regarding the equilibrium solution?





3.3 Partial Market Equilibrium—A Nonlinear Model





Let the linear demand in the isolated market model be replaced by a quadratic demand
function, while the supply function remains linear. Also, let us use numerical coefficients
rather than parameters. Then a model such as the following may emerge:

$$
\begin{aligned}
Q_d &= Q_s \\
Q_d &= 4 - P^2 \\
Q_s &= 4P - 1
\end{aligned}
\tag{3.6}
$$


As previously, this system of three equations can be reduced to a single equation by
elimination of variables (by substitution):

$$4 - P^2 = 4P - 1$$

or

$$P^2 + 4P - 5 = 0 \tag{3.7}$$


This is a quadratic equation because the left-hand expression is a quadratic function of
variable $P$. A major difference between a quadratic equation and a linear one is that, in general,
the former will yield two solution values.


Quadratic Equation versus Quadratic Function 

Before discussing the method of solution, a clear distinction should be made between the 
two terms quadratic equation and quadratic function. According to the earlier discussion, 
the expression $P^2 + 4P - 5$ constitutes a quadratic function, say, $f(P)$. Hence we may write

$$f(P) = P^2 + 4P - 5 \tag{3.8}$$

What (3.8) does is to specify a rule of mapping from $P$ to $f(P)$, such as

  P    | ... | -6 | -5 | -4 | -3 | -2 | -1 |  0 |  1 |  2 | ...
  f(P) | ... |  7 |  0 | -5 | -8 | -9 | -8 | -5 |  0 |  7 | ...





Although we have listed only nine $P$ values in this table, actually all the $P$ values in the
domain of the function are eligible for listing. It is perhaps for this reason that we rarely speak
of “solving” the equation $f(P) = P^2 + 4P - 5$, because we normally expect “solution
values” to be few in number, but here all $P$ values can get involved. Nevertheless, one may
legitimately consider each ordered pair in the table, such as $(-6, 7)$ and $(-5, 0)$, as a
solution of (3.8), since each such ordered pair indeed satisfies that equation. Inasmuch as an
infinite number of such ordered pairs can be written, one for each $P$ value, there is an
infinite number of solutions to (3.8). When plotted as a curve, these ordered pairs together
yield the parabola in Fig. 3.2.
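
The mapping rule in (3.8) can be tabulated mechanically. The short Python sketch below does so for the nine integer values of $P$ from $-6$ to $2$; the choice of this range is simply an assumption matching the table above, and any other values in the domain could be listed in the same way.

```python
def f(P):
    """The quadratic function (3.8): f(P) = P^2 + 4P - 5."""
    return P**2 + 4 * P - 5

# Ordered pairs (P, f(P)) for the nine integer values P = -6, ..., 2.
table = [(P, f(P)) for P in range(-6, 3)]

for P, value in table:
    print(f"{P:>3} {value:>3}")
```

Each printed pair is one "solution" of (3.8) in the sense just described, that is, one point on the parabola.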

In (3.7), where we set the quadratic function $f(P)$ equal to zero, the situation is
fundamentally changed. Since the variable $f(P)$ now disappears (having been assigned a
zero value), the result is a quadratic equation in the single variable $P$.² Now that $f(P)$ is
restricted to a zero value, only a select number of $P$ values can satisfy (3.7) and qualify as
its solution values, namely, those $P$ values at which the parabola in Fig. 3.2 intersects the
horizontal axis, on which $f(P)$ is zero. Note that this time the solution values are just $P$
values, not ordered pairs. The solution $P$ values are often referred to as the roots of the
quadratic equation $f(P) = 0$, or, alternatively, as the zeros of the quadratic function $f(P)$.


² The distinction between quadratic function and quadratic equation just discussed can be extended
also to cases of polynomials other than quadratic. Thus, a cubic equation results when a cubic
function is set equal to zero.


FIGURE 3.2 [Graph of the parabola $f(P) = P^2 + 4P - 5$.]

There are two such intersection points in Fig. 3.2, namely, $(1, 0)$ and $(-5, 0)$. As
required, the second element of each of these ordered pairs (the ordinate of the corresponding
point) shows $f(P) = 0$ in both cases. The first element of each ordered pair (the
abscissa of the point), on the other hand, gives the solution value of $P$. Here we get two
solutions,

$$P_1^* = 1 \qquad \text{and} \qquad P_2^* = -5$$

but only the first is economically admissible, as negative prices are ruled out.


The Quadratic Formula 
Equation (3.7) has been solved graphically, but an algebraic method is also available. In
general, given a quadratic equation in the form

$$ax^2 + bx + c = 0 \qquad (a \neq 0) \tag{3.9}$$

there are two roots, which can be obtained from the quadratic formula:

$$x_1^*, x_2^* = \frac{-b \pm (b^2 - 4ac)^{1/2}}{2a} \tag{3.10}$$

where the $+$ part of the $\pm$ sign yields $x_1^*$ and the $-$ part yields $x_2^*$.
Also note that as long as $b^2 - 4ac > 0$, the values of $x_1^*$ and $x_2^*$ would differ, giving us
two distinct real numbers as the roots. But in the special case where $b^2 - 4ac = 0$, we
would find that $x_1^* = x_2^* = -b/2a$. In this case, the two roots share the identical value; they
are referred to as repeated roots. In yet another special case where $b^2 - 4ac < 0$, we would
have the task of taking the square root of a negative number, which is not possible in the
real-number system. In this latter case, no real-valued roots exist. We shall discuss this
matter further in Sec. 16.1.

This widely used formula is derived by means of a process known as “completing the
square.” First, dividing each term of (3.9) by $a$ results in the equation

$$x^2 + \frac{b}{a}x + \frac{c}{a} = 0$$

Subtracting $c/a$ from, and adding $b^2/4a^2$ to, both sides of the equation, we get

$$x^2 + \frac{b}{a}x + \frac{b^2}{4a^2} = \frac{b^2}{4a^2} - \frac{c}{a}$$

The left side is now a “perfect square,” and thus the equation can be expressed as

$$\left(x + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2}$$

or, after taking the square root on both sides,

$$x + \frac{b}{2a} = \pm\frac{(b^2 - 4ac)^{1/2}}{2a}$$

Finally, by subtracting $b/2a$ from both sides, the result in (3.10) is obtained.
Applying the formula to (3.7), where $a = 1$, $b = 4$, $c = -5$, and $x = P$, the roots are
found to be

$$P_1^*, P_2^* = \frac{-4 \pm (16 + 20)^{1/2}}{2} = \frac{-4 \pm 6}{2} = 1, -5$$

which check with the graphical solutions in Fig. 3.2. Again, we reject $P_2^* = -5$ on
economic grounds and, after omitting the subscript 1, write simply $P^* = 1$.

With this information in hand, the equilibrium quantity Q* can readily be found from 
either the second or the third equation of (3.6) to be Q* = 3. 
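
The algebraic solution can be checked with a short Python sketch of the quadratic formula (3.10), including the three discriminant cases discussed earlier. The function names `quadratic_roots`, `demand`, and `supply` are hypothetical labels introduced for this illustration; the coefficients are those of model (3.6) and equation (3.7).

```python
import math

def quadratic_roots(a, b, c):
    """Real roots of ax^2 + bx + c = 0 by formula (3.10)."""
    if a == 0:
        raise ValueError("a must be nonzero for a quadratic equation")
    disc = b * b - 4 * a * c
    if disc < 0:
        return []                   # no real-valued roots
    if disc == 0:
        return [-b / (2 * a)]       # repeated roots share one value
    sqrt_disc = math.sqrt(disc)
    return [(-b + sqrt_disc) / (2 * a), (-b - sqrt_disc) / (2 * a)]

def demand(P):
    return 4 - P**2                 # second equation of (3.6)

def supply(P):
    return 4 * P - 1                # third equation of (3.6)

roots = quadratic_roots(1, 4, -5)   # equation (3.7)
# Keep only economically admissible roots: nonnegative price and quantity.
equilibria = [(P, supply(P)) for P in roots if P >= 0 and supply(P) >= 0]
print(roots)        # [1.0, -5.0]
print(equilibria)   # [(1.0, 3.0)]
```

At the admissible root, `demand(1.0)` and `supply(1.0)` both equal 3.0, which confirms that either equation of (3.6) may be used to find the equilibrium quantity.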


Another Graphical Solution 

One method of graphical solution of the present model has been presented in Fig. 3.2. 
However, since the quantity variable has been eliminated in deriving the quadratic equa- 
tion, only P* can be found from that figure. If we are interested in finding P* and Q* 
simultaneously from a graph, we must instead use a diagram with Q on one axis and P on 
the other, similar in construction to Fig. 3.1. This is illustrated in Fig. 3.3. Our problem is 
of course again to find the intersection of two sets of points, namely,

$$
\begin{aligned}
D &= \{(P, Q) \mid Q = 4 - P^2\} \\
\text{and} \quad S &= \{(P, Q) \mid Q = 4P - 1\}
\end{aligned}
$$




FIGURE 3.3


If no restriction is placed on the domain and the range, the intersection set will contain two
elements, namely,

$$D \cap S = \{(1, 3), (-5, -21)\}$$

The former is located in quadrant I, and the latter (not drawn) in quadrant III. If the domain
and range are restricted to being nonnegative, however, only the first ordered pair $(1, 3)$ can
be accepted. Then the equilibrium is again unique.


Higher-Degree Polynomial Equations

If a system of simultaneous equations reduces not to a linear equation such as (3.3)³ or to
a quadratic equation such as (3.7) but to a cubic (third-degree polynomial) equation or
quartic (fourth-degree polynomial) equation, the roots will be more difficult to find. One
useful method which may work is that of factoring the function.


Example 1

The expression $x^3 - x^2 - 4x + 4$ can be written as the product of three factors $(x - 1)$,
$(x + 2)$, and $(x - 2)$. Thus the cubic equation

$$x^3 - x^2 - 4x + 4 = 0$$

can be written after factoring as

$$(x - 1)(x + 2)(x - 2) = 0$$

In order for the left-hand product to be zero, at least one of the three terms in the product
must be zero. Setting each term equal to zero in turn, we get

$$x - 1 = 0 \qquad \text{or} \qquad x + 2 = 0 \qquad \text{or} \qquad x - 2 = 0$$

These three equations will supply the three roots of the cubic equation, namely,

$$x_1^* = 1 \qquad x_2^* = -2 \qquad \text{and} \qquad x_3^* = 2$$
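
The factoring in Example 1 can be verified by multiplying the factors back out. The Python sketch below represents a polynomial as a list of coefficients from the highest-degree term down to the constant term; this representation and the helper names `poly_mul` and `from_roots` are assumptions made for the illustration only.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (highest degree first)."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result

def from_roots(roots):
    """Expand the product of the factors (x - root) over all given roots."""
    poly = [1]
    for r in roots:
        poly = poly_mul(poly, [1, -r])   # the factor (x - r)
    return poly

# Roots 1, -2, 2 correspond to the factors (x - 1)(x + 2)(x - 2).
print(from_roots([1, -2, 2]))  # [1, -1, -4, 4]
```

The output lists the coefficients of $x^3 - x^2 - 4x + 4$, so the three factors do reproduce the cubic expression of Example 1.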


³ Equation (3.3) can be viewed as the result of setting the linear function $(b + d)P - (a + c)$ equal to
zero.


Example 1 illustrates two interesting and useful facts about factoring. First, given a
third-degree polynomial equation, factoring results in three terms of the form $(x - \text{root})$,
thus yielding three roots. Generally, an $n$th-degree polynomial equation should yield a total
of $n$ roots. Second, and more important for the purpose of root search, we note the following
relationship between the three roots $(1, -2, 2)$ and the constant term 4: Since the constant
term must be, up to sign, the product of the three roots, each root must be a divisor of the
constant term. This relationship can be formalized in the following theorem:


Theorem I  Given the polynomial equation

$$x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0$$

where all the coefficients are integers, and the coefficient of $x^n$ is unity, if there exist
integer roots, then each of them must be a divisor of $a_0$.

Sometimes, however, we encounter fractional coefficients in the polynomial equation,
as in

$$x^4 + \tfrac{5}{2}x^3 - \tfrac{11}{2}x^2 - 10x + 6 = 0$$

which does not fall under the provision of Theorem I. Even if we multiply through by 2 to
get rid of the fractions (ending in the form shown in Example 2 which follows), we still
cannot apply Theorem I, because the coefficient of the highest-degree term is not unity. In
such cases, we can resort to a more general theorem:


Theorem II  Given the polynomial equation with integer coefficients

$$a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0$$

if there exists a rational root $r/s$, where $r$ and $s$ are integers without a common divisor
except unity, then $r$ is a divisor of $a_0$, and $s$ is a divisor of $a_n$.


Example 2

Does the quartic equation

$$2x^4 + 5x^3 - 11x^2 - 20x + 12 = 0$$


have rational roots? With $a_0 = 12$, the only possible values for the numerator $r$ in $r/s$ are the
set of divisors $\{1, -1, 2, -2, 3, -3, 4, -4, 6, -6, 12, -12\}$. And, with $a_n = 2$, the only
possible values for $s$ are the set of divisors $\{1, -1, 2, -2\}$. Taking each element in the $r$ set in turn,
and dividing it by each element in the $s$ set, respectively, we find that $r/s$ can only assume
the values

$$1, -1, \tfrac{1}{2}, -\tfrac{1}{2}, 2, -2, 3, -3, \tfrac{3}{2}, -\tfrac{3}{2}, 4, -4, 6, -6, 12, -12$$

Among these candidates for roots, many fail to satisfy the given equation. Letting $x = 1$ in
the quartic equation, for instance, we get the ridiculous result $-12 = 0$. In fact, since we are
solving a quartic equation, we can expect at most four of the listed $r/s$ values to qualify as
roots. The four successful candidates turn out to be $\tfrac{1}{2}$, $2$, $-2$, and $-3$. According to the
factoring principle, we can thus write the given quartic equation equivalently as

$$\left(x - \tfrac{1}{2}\right)(x - 2)(x + 2)(x + 3) = 0$$

where the first factor can also be written as $(2x - 1)$ instead.
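
The candidate search of Example 2 is mechanical enough to sketch in Python. The helper names `divisors` and `rational_roots` are hypothetical, and the sketch assumes integer coefficients listed from $a_n$ down to $a_0$ with both $a_n$ and $a_0$ nonzero. It builds the candidate set $r/s$ required by Theorem II and keeps the candidates that make the polynomial exactly zero; repeated roots would be reported only once.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Distinct rational roots of a polynomial with integer coefficients.

    coeffs runs from a_n down to a_0; a_n and a_0 are assumed nonzero.
    """
    a_n, a_0 = coeffs[0], coeffs[-1]
    candidates = {Fraction(sign * r, s)
                  for r in divisors(a_0)
                  for s in divisors(a_n)
                  for sign in (1, -1)}

    def value(x):
        result = Fraction(0)
        for coef in coeffs:          # Horner's rule, exact arithmetic
            result = result * x + coef
        return result

    return sorted(x for x in candidates if value(x) == 0)

roots = rational_roots([2, 5, -11, -20, 12])
print([str(r) for r in roots])  # ['-3', '-2', '1/2', '2']
```

The four values returned agree with the successful candidates named in Example 2.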




In Example 2, we rejected the root candidate 1 because $x = 1$ fails to satisfy the given
equation; i.e., substitution of $x = 1$ into the equation does not produce the identity $0 = 0$
as required. Now consider the case where $x = 1$ indeed is a root of some polynomial equation.
In that case, since $x^n = x^{n-1} = \cdots = x = 1$, the polynomial equation would reduce
to the simple form $a_n + a_{n-1} + \cdots + a_1 + a_0 = 0$. This fact provides the rationale for the
following theorem:


Theorem III  Given the polynomial equation

$$a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 = 0$$

if the coefficients $a_n, a_{n-1}, \ldots, a_0$ add up to zero, then $x = 1$ is a root of the equation.
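
Theorem III amounts to a one-line test, which the Python sketch below compares against direct substitution of $x = 1$. The function names `evaluate` and `one_is_root` are hypothetical; the two coefficient lists are the cubic of Example 1 and the quartic of Example 2.

```python
def evaluate(coeffs, x):
    """Value of the polynomial at x; coeffs run from a_n down to a_0."""
    result = 0
    for coef in coeffs:
        result = result * x + coef
    return result

def one_is_root(coeffs):
    """Theorem III: x = 1 is a root when the coefficients add up to zero."""
    return sum(coeffs) == 0

cubic = [1, -1, -4, 4]             # x^3 - x^2 - 4x + 4 (Example 1)
quartic = [2, 5, -11, -20, 12]     # 2x^4 + 5x^3 - 11x^2 - 20x + 12 (Example 2)

print(one_is_root(cubic), evaluate(cubic, 1))      # True 0
print(one_is_root(quartic), evaluate(quartic, 1))  # False -12
```

The second line reproduces the result $-12 = 0$ that led to the rejection of the candidate $x = 1$ in Example 2.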





EXERCISE 3.3 


1. Find the zeros of the following functions graphically:
   (a) $f(x) = x^2 - 8x + 15$          (b) $g(x) = 2x^2 - 4x - 16$
2. Solve Prob. 1 by the quadratic formula.
3. (a) Find a cubic equation with roots 6, −1, and 3.
   (b) Find a quartic equation with roots 1, 2, 3, and 5.
4. For each of the following polynomial equations, determine if x= 1 is a root. 
(@) 3 — 2x? ~3x-2=0 (3x42? + 2x-4=0 
(b) 2x3 — dx? tx-2=0 
5. Find the rational roots, if any, of the following: 
(a) x} - 4x2 ~x+6=0 (Q 8 4322 -3x- 4 =0 
(6) 8x3-+6x2-3x-1=0 (a) x4- 6x7 + 73x? -3x-2=0 
6. Find the equilibrium solution for each of the following models:

   (a) $Q_d = Q_s$                  (b) $Q_d = Q_s$
       $Q_d = 3 - P^2$                  $Q_d = 8 - P^2$
       $Q_s = 6P - 4$                   $Q_s = P^2 - 2$


7. The market equilibrium condition, $Q_d = Q_s$, is often expressed in an equivalent
   alternative form, $Q_d - Q_s = 0$, which has the economic interpretation “excess demand is
   zero.” Does (3.7) represent this latter version of the equilibrium condition? If not,
   supply an appropriate economic interpretation for (3.7).


3.4 General Market Equilibrium 





The last two sections dealt with models of an isolated market, wherein the $Q_d$ and $Q_s$ of a
commodity are functions of the price of that commodity alone. In the actual world, though,
no commodity ever enjoys (or suffers) such a hermitic existence; for every commodity,
there would normally exist many substitutes and complementary goods. Thus a more realistic
depiction of the demand function of a commodity should take into account the effect
not only of the price of the commodity itself but also of the prices of related commodities.
The same also holds true for the supply function. Once the prices of other commodities are
brought into the picture, however, the structure of the model itself must be broadened so as to
be able to yield the equilibrium values of these other prices as well. As a result, the price and
quantity variables of multiple commodities must enter endogenously into the model en masse.

In an isolated-market model, the equilibrium condition consists of only one equation,
$Q_d = Q_s$, or $E \equiv Q_d - Q_s = 0$, where $E$ stands for excess demand. When several
interdependent commodities are simultaneously considered, equilibrium would require the
absence of excess demand for each and every commodity included in the model, for if so
much as one commodity is faced with an excess demand, the price adjustment of that
commodity will necessarily affect the quantities demanded and quantities supplied of the
related commodities, thereby causing price changes all around. Consequently, the equilibrium
condition of an $n$-commodity market model will involve $n$ equations, one for each
commodity, in the form

$$E_i \equiv Q_{di} - Q_{si} = 0 \qquad (i = 1, 2, \ldots, n) \tag{3.11}$$

If a solution exists, there will be a set of prices $P_i^*$ and corresponding quantities $Q_i^*$ such
that all the $n$ equations in the equilibrium condition will be simultaneously satisfied.





Two-Commodity Market Model 

To illustrate the problem, let us discuss a simple model in which only two commodities are
related to each other. For simplicity, the demand and supply functions of both commodities
are assumed to be linear. In parametric terms, such a model can be written as

$$
\begin{aligned}
Q_{d1} - Q_{s1} &= 0 \\
Q_{d1} &= a_0 + a_1 P_1 + a_2 P_2 \\
Q_{s1} &= b_0 + b_1 P_1 + b_2 P_2 \\
Q_{d2} - Q_{s2} &= 0 \\
Q_{d2} &= \alpha_0 + \alpha_1 P_1 + \alpha_2 P_2 \\
Q_{s2} &= \beta_0 + \beta_1 P_1 + \beta_2 P_2
\end{aligned}
\tag{3.12}
$$


where the $a$ and $b$ coefficients pertain to the demand and supply functions of the first
commodity, and the $\alpha$ and $\beta$ coefficients are assigned to those of the second. We have not
bothered to specify the signs of the coefficients, but in the course of analysis certain restrictions
will emerge as a prerequisite to economically sensible results. Also, in a subsequent numerical
example, some comments will be made on the specific signs to be given the coefficients.

As a first step toward the solution of this model, we can again resort to elimination of
variables. By substituting the second and third equations into the first (for the first
commodity) and the fifth and sixth equations into the fourth (for the second commodity), the
model is reduced to two equations in two variables:

$$
\begin{aligned}
(a_0 - b_0) + (a_1 - b_1)P_1 + (a_2 - b_2)P_2 &= 0 \\
(\alpha_0 - \beta_0) + (\alpha_1 - \beta_1)P_1 + (\alpha_2 - \beta_2)P_2 &= 0
\end{aligned}
\tag{3.13}
$$

These represent the two-commodity version of (3.11), after the demand and supply
functions have been substituted into the two equilibrium conditions.

Although this is a simple system of only two equations, as many as 12 parameters are
involved, and algebraic manipulations will prove unwieldy unless some sort of shorthand
is introduced. Let us therefore define the shorthand symbols

$$c_i \equiv a_i - b_i \qquad \gamma_i \equiv \alpha_i - \beta_i \qquad (i = 0, 1, 2)$$

Then, after transposing the $c_0$ and $\gamma_0$ terms to the right-hand side, we get

$$
\begin{aligned}
c_1 P_1 + c_2 P_2 &= -c_0 \\
\gamma_1 P_1 + \gamma_2 P_2 &= -\gamma_0
\end{aligned}
\tag{3.13'}
$$
which may be solved by further elimination of variables. From the first equation, it can be
found that $P_2 = -(c_0 + c_1 P_1)/c_2$. Substituting this into the second equation and solving,
we get

$$P_1^* = \frac{c_2\gamma_0 - c_0\gamma_2}{c_1\gamma_2 - c_2\gamma_1} \tag{3.14}$$

Note that $P_1^*$ is entirely expressed, as a solution value should be, in terms of the data
(parameters) of the model. By a similar process, the equilibrium price of the second
commodity is found to be

$$P_2^* = \frac{c_0\gamma_1 - c_1\gamma_0}{c_1\gamma_2 - c_2\gamma_1} \tag{3.15}$$

For these two values to make sense, however, certain restrictions should be imposed on the
model. First, since division by zero is undefined, we must require the common denominator
of (3.14) and (3.15) to be nonzero, that is, $c_1\gamma_2 \neq c_2\gamma_1$. Second, to assure positivity, the
numerator must have the same sign as the denominator.

The equilibrium prices having been found, the equilibrium quantities $Q_1^*$ and $Q_2^*$ can
readily be calculated by substituting (3.14) and (3.15) into the second (or third) equation
and the fifth (or sixth) equation of (3.12). These solution values will naturally also be
expressed in terms of the parameters. (Their actual calculation is left to you as an exercise.)


Numerical Example 
Suppose that the demand and supply functions are numerically as follows:

$$
\begin{aligned}
Q_{d1} &= 10 - 2P_1 + P_2 \\
Q_{s1} &= -2 + 3P_1 \\
Q_{d2} &= 15 + P_1 - P_2 \\
Q_{s2} &= -1 + 2P_2
\end{aligned}
\tag{3.16}
$$

What is the equilibrium solution?

Before answering the question, let us take a look at the numerical coefficients. For each
commodity, $Q_{si}$ is seen to depend on $P_i$ alone, but $Q_{di}$ is shown as a function of both
prices. Note that while $P_1$ has a negative coefficient in $Q_{d1}$, as we would expect, the
coefficient of $P_2$ is positive. The fact that a rise in $P_2$ tends to raise $Q_{d1}$ suggests that the two
commodities are substitutes for each other. The role of $P_1$ in the $Q_{d2}$ function has a similar
interpretation.

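With these coefficients, the shorthand symbols $c_i$ and $\gamma_i$ will take the following values:

$$
\begin{aligned}
c_0 &= 10 - (-2) = 12 & c_1 &= -2 - 3 = -5 & c_2 &= 1 - 0 = 1 \\
\gamma_0 &= 15 - (-1) = 16 & \gamma_1 &= 1 - 0 = 1 & \gamma_2 &= -1 - 2 = -3
\end{aligned}
$$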




By direct substitution of these into (3.14) and (3.15), we obtain 


m2 35), 4 92 
Pl=q=35 and Ppa qe 


And the further substitution of Py‘ and Py into (3.16) yields 
O=F$=% and Ob=8=s221 


Thus all the equilibrium values turn out positive, as required. In order to preserve the exact
values of $P_1^*$ and $P_2^*$ to be used in the further calculation of $Q_1^*$ and $Q_2^*$, it is advisable to
express them as fractions rather than decimals.
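The arithmetic of this numerical example can be reproduced in a few lines. The following is only a sketch, assuming the coefficients of (3.16) and the formulas (3.14) and (3.15); the tuple layout and all variable names are our own choices, and Python's `fractions.Fraction` is used so that the results stay exact, in the spirit of the advice about fractions.

```python
from fractions import Fraction

# Each function of (3.16) stored as (constant, coefficient of P1, coefficient of P2).
demand_1 = (10, -2, 1)   # Qd1 = 10 - 2*P1 + P2
supply_1 = (-2, 3, 0)    # Qs1 = -2 + 3*P1
demand_2 = (15, 1, -1)   # Qd2 = 15 + P1 - P2
supply_2 = (-1, 0, 2)    # Qs2 = -1 + 2*P2

# Shorthand symbols: c_i for commodity 1, gamma_i (here g) for commodity 2,
# each being a demand coefficient minus the matching supply coefficient.
c = [Fraction(d - s) for d, s in zip(demand_1, supply_1)]
g = [Fraction(d - s) for d, s in zip(demand_2, supply_2)]

# Common denominator of (3.14) and (3.15); it must be nonzero.
denom = c[1] * g[2] - c[2] * g[1]
if denom == 0:
    raise ZeroDivisionError("c1*gamma2 equals c2*gamma1: no unique solution")

p1 = (c[2] * g[0] - c[0] * g[2]) / denom   # formula (3.14)
p2 = (c[0] * g[1] - c[1] * g[0]) / denom   # formula (3.15)

# Equilibrium quantities, here taken from the supply functions.
q1 = supply_1[0] + supply_1[1] * p1 + supply_1[2] * p2
q2 = supply_2[0] + supply_2[1] * p1 + supply_2[2] * p2

print(p1, p2, q1, q2)   # 26/7 46/7 64/7 85/7
```

Note that `Fraction` reduces 52/14 to 26/7 and 92/14 to 46/7 automatically. Substituting the prices into the demand functions instead of the supply functions gives the same quantities, which is a useful cross-check.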

Could we have obtained the equilibrium prices graphically? The answer is yes. From
(3.13), it is clear that a two-commodity model can be summarized by two equations in two
variables $P_1$ and $P_2$. With known numerical coefficients, both equations can be plotted in the
$P_1P_2$ coordinate plane, and the intersection of the two curves will then pinpoint $P_1^*$ and $P_2^*$.


n-Commodity Case 

The previous discussion of the multicommodity market has been limited to the case of 
two commodities, but it should be apparent that we are already moving from partial-
equilibrium analysis in the direction of general-equilibrium analysis. As more commodities
enter into a model, there will be more variables and more equations, and the equations will 
get longer and more complicated. If all the commodities in an economy are included in a 
comprehensive market model, the result will be a Walrasian type of general-equilibrium 
model, in which the excess demand for every commodity is considered to be a function of 
the prices of all the commodities in the economy. 

Some of the prices may, of course, carry zero coefficients when they play no role in the 
determination of the excess demand of a particular commodity; e.g., in the excess-demand 
function of pianos the price of popcorn may well have a zero coefficient. In general, however,
with $n$ commodities in all, we may express the demand and supply functions as
follows (using $Q_{di}$ and $Q_{si}$ as function symbols in place of $f$ and $g$):


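$$\begin{aligned}
Q_{di} &= Q_{di}(P_1, P_2, \ldots, P_n) \\
Q_{si} &= Q_{si}(P_1, P_2, \ldots, P_n)
\end{aligned} \qquad (i = 1, 2, \ldots, n) \qquad (3.17)$$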


In view of the index subscript, these two equations represent the totality of the $2n$ functions
which the model contains. (These functions are not necessarily linear.) Moreover, the
equilibrium condition is itself composed of a set of $n$ equations,


$$Q_{di} - Q_{si} = 0 \qquad (i = 1, 2, \ldots, n) \qquad (3.18)$$


When (3.18) is added to (3.17), the model becomes complete. You should therefore count a
total of 3n equations. 

Upon substitution of (3.17) into (3.18), however, the model can be reduced to a set of $n$
simultaneous equations only: 


$$Q_{di}(P_1, P_2, \ldots, P_n) - Q_{si}(P_1, P_2, \ldots, P_n) = 0 \qquad (i = 1, 2, \ldots, n)$$


Besides, inasmuch as $E_i \equiv Q_{di} - Q_{si}$, where $E_i$ is necessarily also a function of all the $n$
prices, the latter set of equations may be written alternatively as


$$E_i(P_1, P_2, \ldots, P_n) = 0 \qquad (i = 1, 2, \ldots, n)$$




Solved simultaneously, these $n$ equations can determine the $n$ equilibrium prices $P_i^*$, if
a solution does indeed exist. And then the $Q_i^*$ may be derived from the demand or supply
functions.


Solution of a General-Equation System 
If a model comes equipped with numerical coefficients, as in (3.16), the equilibrium values of
the variables will be in numerical terms, too. On a more general level, if a model is expressed
in terms of parametric constants, as in (3.12), the equilibrium values will also involve parameters
and will hence appear as “formulas,” as exemplified by (3.14) and (3.15). If, for greater
generality, even the function forms are left unspecified in a model, however, as in (3.17), the
manner of expressing the solution values will of necessity be exceedingly general as well.
Drawing upon our experience in parametric models, we know that a solution value is always
an expression in terms of the parameters. For a general-function model containing,
say, a total of $m$ parameters $(a_1, a_2, \ldots, a_m)$, where $m$ is not necessarily equal to $n$, the $n$
equilibrium prices can be expected to take the general analytical form of


$$P_i^* = P_i^*(a_1, a_2, \ldots, a_m) \qquad (i = 1, 2, \ldots, n) \qquad (3.19)$$


This is a symbolic statement to the effect that the solution value of each variable (here,
price) is a function of the set of all parameters of the model. As this is a very general state- 
ment, it really does not give much detailed information about the solution. But in the gen- 
eral analytical treatment of some types of problem, even this seemingly uninformative way 
of expressing a solution will prove of use, as will be seen in Chap. 8. 

Writing such a solution is an easy task. But an important catch exists: the expression in
(3.19) can be justified if and only if a unique solution does indeed exist, for then and only
then can we map the ordered $m$-tuple $(a_1, a_2, \ldots, a_m)$ into a determinate value for each
price $P_i^*$. Yet, unfortunately for us, there is no a priori reason to presume that every model
will automatically yield a unique solution. In this connection, it needs to be emphasized
that the process of “counting equations and unknowns” does not suffice as a test. Some
very simple examples should convince us that an equal number of equations and unknowns
(endogenous variables) does not necessarily guarantee the existence of a unique solution.

Consider the three simultaneous-equation systems 







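$$\begin{aligned} x + y &= 8 \\ x + y &= 9 \end{aligned} \qquad (3.20)$$

$$\begin{aligned} 2x + y &= 12 \\ 4x + 2y &= 24 \end{aligned} \qquad (3.21)$$

$$\begin{aligned} 2x + 3y &= 58 \\ y &= 18 \\ x + y &= 20 \end{aligned} \qquad (3.22)$$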


In (3.20), despite the fact that two unknowns are linked together by exactly two equations,
there is nevertheless no solution. These two equations happen to be inconsistent, for if the
sum of $x$ and $y$ is 8, it cannot possibly be 9 at the same time. In (3.21), another case of two
equations in two variables, the two equations are functionally dependent, which means that
one can be derived from (and is implied by) the other. (Here, the second equation is equal
to two times the first equation.) Consequently, one equation is redundant and may be dropped
from the system, leaving in effect only one equation in two unknowns. The solution will then
be the equation $y = 12 - 2x$, which yields not a unique ordered pair $(x^*, y^*)$ but an infinite
number of them, including (0, 12), (1, 10), (2, 8), etc., all of which satisfy that equation.
Lastly, the case of (3.22) involves more equations than unknowns, yet the ordered pair (2, 18)
does constitute the unique solution to it. The reason is that, in view of the existence of
functional dependence among the equations (the first is equal to the second plus twice the third),
we have in effect only two independent, consistent equations in two variables.
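The three cases can also be told apart mechanically. The sketch below is not the method of this chapter (the formal tests, based on determinants, come later); it assumes the standard row-reduction idea of comparing the number of independent rows in the coefficient array with that in the array augmented by the constants. The function names `rank` and `classify` are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

def rank(rows):
    """Number of independent rows, found by Gauss-Jordan elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0  # index of the next pivot row
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no usable pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def classify(coeffs, consts):
    """Label a linear system as inconsistent, dependent, or uniquely solvable."""
    n_unknowns = len(coeffs[0])
    augmented = [list(row) + [d] for row, d in zip(coeffs, consts)]
    r_coeff, r_aug = rank(coeffs), rank(augmented)
    if r_coeff < r_aug:
        return "inconsistent: no solution"
    if r_coeff < n_unknowns:
        return "dependent: infinitely many solutions"
    return "unique solution"

print(classify([[1, 1], [1, 1]], [8, 9]))                # system (3.20)
print(classify([[2, 1], [4, 2]], [12, 24]))              # system (3.21)
print(classify([[2, 3], [0, 1], [1, 1]], [58, 18, 20]))  # system (3.22)
```

For (3.22) the augmented array has three rows but only two independent ones, which is the functional dependence noted in the text.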

These simple examples should suffice to convey the importance of consistency and
functional independence as the two prerequisites for application of the process of counting
equations and unknowns. In general, in order to apply that process, make sure that (1) the
satisfaction of any one equation in the model will not preclude the satisfaction of another
and (2) no equation is redundant. In (3.17), for example, the $n$ demand and $n$ supply
functions may safely be assumed to be independent of one another, each being derived from a
different source: each demand from the decisions of a group of consumers, and each supply
from the decisions of a group of firms. Thus each function serves to describe one facet
of the market situation, and none is redundant. Mutual consistency may perhaps also be
assumed. In addition, the equilibrium-condition equations in (3.18) are also independent
and presumably consistent. Therefore the analytical solution as written in (3.19) can in
general be considered justifiable.†

For simultaneous-equation models, there exist systematic methods of testing the existence
of a unique (or determinate) solution. These would involve, for linear models, an
application of the concept of determinants, to be introduced in Chap. 5. In the case of
nonlinear models, such a test would also require a knowledge of so-called partial derivatives
and a special type of determinant called the Jacobian determinant, which will be discussed
in Chaps. 7 and 8.












EXERCISE 3.4 


1. Work out the step-by-step solution of (3.13), thereby verifying the results in (3.14)
and (3.15).
2. Rewrite (3.14) and (3.15) in terms of the original parameters of the model in (3.12).
3. The demand and supply functions of a two-commodity market model are as follows:
$Q_{d1} = 18 - 3P_1 + P_2$    $Q_{d2} = 12 + P_1 - 2P_2$
$Q_{s1} = -2 + 4P_1$    $Q_{s2} = -2 + 3P_2$
Find $P_i^*$ and $Q_i^*$ $(i = 1, 2)$. (Use fractions rather than decimals.)


† This is essentially the way that Léon Walras approached the problem of the existence of
a general-market equilibrium. In the modern literature, there can be found a number of
sophisticated mathematical proofs of the existence of a competitive market equilibrium under 
certain postulated economic conditions. But the mathematics used is advanced. The easiest one 
to understand is perhaps the proof given in Robert Dorfman, Paul A. Samuelson, and Robert M. 
Solow, Linear Programming and Economic Analysis, McGraw-Hill Book Company, New York, 1958, 
Chap. 13. 




3.5 Equilibrium in National-Income Analysis 





Even though the discussion of static analysis has hitherto been restricted to market models
in various guises (linear and nonlinear, one-commodity and multicommodity, specific and
general), it, of course, has applications in other areas of economics also. As an example,
we may cite the simplest Keynesian national-income model,


$$\begin{aligned} Y &= C + I_0 + G_0 \\ C &= a + bY \qquad (a > 0, \; 0 < b < 1) \end{aligned} \qquad (3.23)$$


where $Y$ and $C$ stand for the endogenous variables national income and (planned)
consumption expenditure, respectively, and $I_0$ and $G_0$ represent the exogenously determined
investment and government expenditures. The first equation is an equilibrium condition
(national income = total planned expenditure). The second, the consumption function, is
behavioral. The two parameters in the consumption function, $a$ and $b$, stand for the
autonomous consumption expenditure and the marginal propensity to consume, respectively.

It is quite clear that these two equations in two endogenous variables are neither
functionally dependent upon, nor inconsistent with, each other. Thus we would be able to find
the equilibrium values of income and consumption expenditure, $Y^*$ and $C^*$, in terms of the
parameters $a$ and $b$ and the exogenous variables $I_0$ and $G_0$.

Substitution of the second equation into the first will reduce (3.23) to a single equation 
in one variable, $Y$:


$$Y = a + bY + I_0 + G_0$$
or
$$(1 - b)Y = a + I_0 + G_0 \qquad \text{(collecting terms involving } Y\text{)}$$


To find the solution value of $Y$ (equilibrium national income), we only have to divide
through by $(1 - b)$:

$$Y^* = \frac{a + I_0 + G_0}{1 - b} \qquad (3.24)$$

Note, again, that the solution value is expressed entirely in terms of the parameters and
exogenous variables, the given data of the model. Putting (3.24) into the second equation of
(3.23) will then yield the equilibrium level of consumption expenditure:





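$$\begin{aligned} C^* = a + bY^* &= a + \frac{b(a + I_0 + G_0)}{1 - b} \\ &= \frac{a(1 - b) + b(a + I_0 + G_0)}{1 - b} = \frac{a + b(I_0 + G_0)}{1 - b} \end{aligned} \qquad (3.25)$$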


This is again expressed entirely in terms of the given data. 

Both $Y^*$ and $C^*$ have the expression $(1 - b)$ in the denominator; thus a restriction $b \neq 1$
is necessary, to avoid division by zero. Since $b$, the marginal propensity to consume, has been
assumed to be a positive fraction, this restriction is automatically satisfied. For $Y^*$ and $C^*$ to
be positive, moreover, the numerators in (3.24) and (3.25) must be positive. Since the
exogenous expenditures $I_0$ and $G_0$ are normally positive, as is the parameter $a$ (the vertical
intercept of the consumption function), the sign of the numerator expressions will work out, too.




As a check on our calculation, we can add the $C^*$ expression in (3.25) to $(I_0 + G_0)$ and
verify that the sum is equal to the $Y^*$ expression in (3.24).
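The same check can be carried out numerically. The sketch below assumes the solution formulas (3.24) and (3.25); the function name `keynesian_equilibrium` and the parameter values are hypothetical, picked only so that the restrictions $a > 0$ and $0 < b < 1$ hold.

```python
from fractions import Fraction

def keynesian_equilibrium(a, b, i0, g0):
    """Return (Y*, C*) for the model (3.23), using (3.24) and (3.25)."""
    if b == 1:
        raise ValueError("b must differ from 1 to avoid division by zero")
    y_star = (a + i0 + g0) / (1 - b)          # equilibrium income, (3.24)
    c_star = (a + b * (i0 + g0)) / (1 - b)    # equilibrium consumption, (3.25)
    return y_star, c_star

# Hypothetical data: autonomous consumption 50, marginal propensity to
# consume 3/4, investment 30, government expenditure 20.
a, b = Fraction(50), Fraction(3, 4)
i0, g0 = Fraction(30), Fraction(20)

y_star, c_star = keynesian_equilibrium(a, b, i0, g0)
print(y_star, c_star)                       # 400 350

# The check described in the text: C* + (I0 + G0) should equal Y*.
print(c_star + i0 + g0 == y_star)           # True
# The consumption function must hold at the equilibrium as well.
print(c_star == a + b * y_star)             # True
```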

This model is obviously one of extreme simplicity and crudity, but other models of 
national-income determination, in varying degrees of complexity and sophistication, can 
be constructed as well. In each case, however, the principles involved in the construction 
and analysis of the model are identical with those already discussed. For this reason, we 
shall not go into further illustrations here. A more comprehensive national-income model, 
involving the simultaneous equilibrium of the money market and the goods market, will be
discussed in Sec. 8.6. 








EXERCISE 3.5 
1. Given the following model:
   $Y = C + I_0 + G_0$
   $C = a + b(Y - T)$   $(a > 0, \; 0 < b < 1)$   [$T$: taxes]
   $T = d + tY$   $(d > 0, \; 0 < t < 1)$   [$t$: income tax rate]
   (a) How many endogenous variables are there?
   (b) Find $Y^*$, $T^*$, and $C^*$.
2. Let the national-income model be:
   $Y = C + I_0 + G$
   $C = a + b(Y - T_0)$   $(a > 0, \; 0 < b < 1)$
   $G = gY$   $(0 < g < 1)$
   (a) Identify the endogenous variables.
   (b) Give the economic meaning of the parameter $g$.
   (c) Find the equilibrium national income.
   (d) What restriction on the parameters is needed for a solution to exist?
3. Find $Y^*$ and $C^*$ from the following:
   $Y = C + I_0 + G_0$
   $C = 25 + 6Y^{1/2}$
   $I_0 = 16$
   $G_0 = 14$


Chapter 4

Linear Models and Matrix Algebra


For the one-commodity model (3.1), the solutions $P^*$ and $Q^*$ as expressed in (3.4)
and (3.5), respectively, are relatively simple, even though a number of parameters are
involved. As more and more commodities are incorporated into the model, such solution
formulas quickly become cumbersome and unwieldy. That was why we had to resort to a
little shorthand, even for the two-commodity case, in order that the solutions (3.14)
and (3.15) can still be written in a relatively concise fashion. We did not attempt to tackle
any three- or four-commodity models, even in the linear version, primarily because we did
not yet have at our disposal a method suitable for handling a large system of simultaneous
equations. Such a method is found in matrix algebra, the subject of this chapter and the next.

Matrix algebra can enable us to do many things. In the first place, it provides a compact
way of writing an equation system, even an extremely large one. Second, it leads to a way
of testing the existence of a solution by evaluation of a determinant, a concept closely
related to that of a matrix. Third, it gives a method of finding that solution (if it exists).
Since equation systems are encountered not only in static analysis but also in comparative-
static and dynamic analyses and in optimization problems, you will find ample application
of matrix algebra in almost every chapter that is to follow. This is why it is desirable to
introduce matrix algebra early.

However, one slight catch is that matrix algebra is applicable only to linear-equation
systems. How realistically linear equations can describe actual economic relationships
depends, of course, on the nature of the relationships in question. In many cases, even if some
sacrifice of realism is entailed by the assumption of linearity, an assumed linear relationship
can produce a sufficiently close approximation to an actual nonlinear relationship to
warrant its use.

In other cases, while preserving the nonlinearity in the model, we can effect a transfor- 
mation of variables so as to obtain a linear relation to work with. For example, the nonlinear 
function 

$$y = ax^b$$

can be readily transformed, by taking the logarithm on both sides, into the function

$$\log y = \log a + b \log x$$




which is linear in the two variables $(\log y)$ and $(\log x)$. (Logarithms will be discussed in
more detail in Chap. 10.) More importantly, in many applications such as comparative-
static analysis and optimization problems, discussed subsequently, although the original
formulation of the economic model is nonlinear in nature, linear equation systems will
emerge in the course of analysis. Thus the linearity restriction is not nearly as restrictive as
it may first appear.
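To see the logarithmic transformation at work, one can check numerically that a power function traces out a straight line once both variables are expressed in logs. This is a small sketch with hypothetical parameter values ($a = 2$, $b = 1.5$) and natural logarithms; any logarithm base would serve equally well.

```python
import math

a, b = 2.0, 1.5          # hypothetical parameters of y = a * x**b

def y(x):
    """The nonlinear function y = a * x**b."""
    return a * x ** b

# For each x, log y should equal log a + b * log x.
for x in [1.0, 2.0, 4.0, 8.0]:
    lhs = math.log(y(x))
    rhs = math.log(a) + b * math.log(x)
    assert math.isclose(lhs, rhs)

# In (log x, log y) coordinates the slope between any two points equals b,
# and the intercept (the value at log x = 0, that is, x = 1) equals log a.
slope = (math.log(y(8.0)) - math.log(y(2.0))) / (math.log(8.0) - math.log(2.0))
intercept = math.log(y(1.0))
print(math.isclose(slope, b), math.isclose(intercept, math.log(a)))
```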


4.1 Matrices and Vectors 








The two-commodity market model (3.12) can be written, after eliminating the quantity
variables, as a system of two linear equations, as in (3.13'),


$$\begin{aligned} c_1P_1 + c_2P_2 &= -c_0 \\ \gamma_1P_1 + \gamma_2P_2 &= -\gamma_0 \end{aligned}$$


where the parameters $c_0$ and $\gamma_0$ appear to the right of the equals sign. In general, a system
of $m$ linear equations in $n$ variables $(x_1, x_2, \ldots, x_n)$ can also be arranged into such a
format:


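$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= d_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= d_2 \\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= d_m
\end{aligned} \qquad (4.1)$$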


In (4.1), the variable $x_1$ appears only within the leftmost column, and in general the
variable $x_j$ appears only in the $j$th column on the left side of the equals sign. The double-
subscripted parameter symbol $a_{ij}$ represents the coefficient appearing in the $i$th equation
and attached to the $j$th variable. For example, $a_{21}$ is the coefficient in the second equation,
attached to the variable $x_1$. The parameter $d_i$, which is unattached to any variable, on the
other hand, represents the constant term in the $i$th equation. For instance, $d_1$ is the constant
term in the first equation. All subscripts are therefore keyed to the specific locations of the
variables and parameters in (4.1).


Matrices as Arrays 

There are essentially three types of ingredients in the equation system (4.1). The first is the
set of coefficients $a_{ij}$; the second is the set of variables $x_1, \ldots, x_n$; and the last is the set of
constant terms $d_1, \ldots, d_m$. If we arrange the three sets as three rectangular arrays and label
them, respectively, as $A$, $x$, and $d$ (without subscripts), then we have


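$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \qquad x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \qquad d = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_m \end{bmatrix} \qquad (4.2)$$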




As a simple example, given the linear-equation system

$$\begin{aligned} 6x_1 + 3x_2 + x_3 &= 22 \\ x_1 + 4x_2 - 2x_3 &= 12 \\ 4x_1 - x_2 + 5x_3 &= 10 \end{aligned} \qquad (4.3)$$


we can write 
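$$A = \begin{bmatrix} 6 & 3 & 1 \\ 1 & 4 & -2 \\ 4 & -1 & 5 \end{bmatrix} \qquad x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \qquad d = \begin{bmatrix} 22 \\ 12 \\ 10 \end{bmatrix} \qquad (4.4)$$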


Each of the three arrays in (4.2) or (4.4) constitutes a matrix.

A matrix is defined as a rectangular array of numbers, parameters, or variables. The
members of the array, referred to as the elements of the matrix, are usually enclosed in
brackets, as in (4.2), or sometimes in parentheses or with double vertical lines: $\|\ \|$. Note
that in matrix $A$ (the coefficient matrix of the equation system), the elements are separated
not by commas but by blank spaces only. As a shorthand device, the array in matrix $A$ can
be written more simply as


$$A = [a_{ij}] \qquad \begin{pmatrix} i = 1, 2, \ldots, m \\ j = 1, 2, \ldots, n \end{pmatrix}$$


Inasmuch as the location of each element in a matrix is unequivocally fixed by the sub- 
script, every matrix is an ordered set. 


Vectors as Special Matrices 

The number of rows and the number of columns in a matrix together define the dimension
of the matrix. Since matrix $A$ in (4.2) contains $m$ rows and $n$ columns, it is said to be of
dimension $m \times n$ (read “$m$ by $n$”). It is important to remember that the row number always
precedes the column number; this is in line with the way the two subscripts in $a_{ij}$ are
ordered. In the special case where $m = n$, the matrix is called a square matrix; thus the
matrix $A$ in (4.4) is a $3 \times 3$ square matrix.

Some matrices may contain only one column, such as $x$ and $d$ in (4.2) or (4.4). Such
matrices are given the special name column vectors. In (4.2), the dimension of $x$ is $n \times 1$,
and that of $d$ is $m \times 1$; in (4.4) both $x$ and $d$ are $3 \times 1$. If we arranged the variables $x_j$ in a
horizontal array, though, there would result a $1 \times n$ matrix, which is called a row vector. For
notation purposes, a row vector is often distinguished from a column vector by the use of a
primed symbol:


$$x' = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}$$


You may observe that a vector (whether row or column) is merely an ordered $n$-tuple, and
as such it may sometimes be interpreted as a point in an $n$-dimensional space. In turn, the
$m \times n$ matrix $A$ can be interpreted as an ordered set of $m$ row vectors or as an ordered set
of $n$ column vectors. These ideas will be followed up in Chap. 5.

An issue of more immediate interest is how the matrix notation can enable us, as 
promised, to express an equation system in a compact way. With the matrices defined in
(4.4), we can express the equation system (4.3) simply as 


Ax=d 




In fact, if $A$, $x$, and $d$ are given the meanings in (4.2), then even the general-equation
system in (4.1) can be written as $Ax = d$. The compactness of this notation is thus
unmistakable.

However, the equation $Ax = d$ prompts at least two questions. How do we multiply two
matrices $A$ and $x$? What is meant by the equality of $Ax$ and $d$? Since matrices involve
whole blocks of numbers, the familiar algebraic operations defined for single numbers are
not directly applicable, and there is a need for a new set of operational rules.





EXERCISE 4.1 


1. Rewrite the market model (3.1) in the format of (4.1), and show that, if the three
variables are arranged in the order $Q_d$, $Q_s$, and $P$, the coefficient matrix will be

$$\begin{bmatrix} 1 & -1 & 0 \\ 1 & 0 & b \\ 0 & 1 & -d \end{bmatrix}$$

How would you write the vector of constants?

2. Rewrite the market model (3.12) in the format of (4.1) with the variables arranged in
the following order: $Q_{d1}$, $Q_{s1}$, $Q_{d2}$, $Q_{s2}$, $P_1$, $P_2$. Write out the coefficient matrix, the
variable vector, and the constant vector.

3. Can the market model (3.6) be rewritten in the format of (4.1)? Why?

4. Rewrite the national-income model (3.23) in the format of (4.1), with $Y$ as the first
variable. Write out the coefficient matrix and the constant vector.

5. Rewrite the national-income model of Exercise 3.5-1 in the format of (4.1), with the
variables in the order $Y$, $T$, and $C$. [Hint: Watch out for the multiplicative expression
$b(Y - T)$ in the consumption function.]


4.2 Matrix Operations 





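As a preliminary, let us first define the word equality. Two matrices $A = [a_{ij}]$ and $B = [b_{ij}]$
are said to be equal if and only if they have the same dimension and have identical elements
in the corresponding locations in the array. In other words, $A = B$ if and only if $a_{ij} = b_{ij}$
for all values of $i$ and $j$. Thus, for example, we find

$$\begin{bmatrix} 4 & 3 \\ 2 & 0 \end{bmatrix} = \begin{bmatrix} 4 & 3 \\ 2 & 0 \end{bmatrix} \neq \begin{bmatrix} 2 & 0 \\ 4 & 3 \end{bmatrix}$$

As another example, if $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 7 \\ 4 \end{bmatrix}$, this will mean that $x = 7$ and $y = 4$.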


Addition and Subtraction of Matrices 

Two matrices can be added if and only if they have the same dimension. When this
dimensional requirement is met, the matrices are said to be conformable for addition. In that case,
the addition of $A = [a_{ij}]$ and $B = [b_{ij}]$ is defined as the addition of each pair of
corresponding elements.


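Example 1

$$\begin{bmatrix} 4 & 9 \\ 2 & 1 \end{bmatrix} + \begin{bmatrix} 2 & 0 \\ 0 & 7 \end{bmatrix} = \begin{bmatrix} 4+2 & 9+0 \\ 2+0 & 1+7 \end{bmatrix} = \begin{bmatrix} 6 & 9 \\ 2 & 8 \end{bmatrix}$$

Example 2

$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} + \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \\ b_{31} & b_{32} \end{bmatrix} = \begin{bmatrix} a_{11}+b_{11} & a_{12}+b_{12} \\ a_{21}+b_{21} & a_{22}+b_{22} \\ a_{31}+b_{31} & a_{32}+b_{32} \end{bmatrix}$$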

In general, we may state the rule thus:

$$[a_{ij}] + [b_{ij}] = [c_{ij}] \qquad \text{where } c_{ij} = a_{ij} + b_{ij}$$

Note that the sum matrix $[c_{ij}]$ must have the same dimension as the component matrices
$[a_{ij}]$ and $[b_{ij}]$.





The subtraction operation $A - B$ can be similarly defined if and only if $A$ and $B$ have
the same dimension. The operation entails the result

$$[a_{ij}] - [b_{ij}] = [d_{ij}] \qquad \text{where } d_{ij} = a_{ij} - b_{ij}$$

Example 3

$$\begin{bmatrix} 19 & 3 \\ 2 & 0 \end{bmatrix} - \begin{bmatrix} 6 & 8 \\ 1 & 3 \end{bmatrix} = \begin{bmatrix} 19-6 & 3-8 \\ 2-1 & 0-3 \end{bmatrix} = \begin{bmatrix} 13 & -5 \\ 1 & -3 \end{bmatrix}$$

The subtraction operation $A - B$ may be considered alternatively as an addition operation
involving a matrix $A$ and another matrix $(-1)B$. This, however, raises the question of what
is meant by the multiplication of a matrix by a single number (here, $-1$).


Scalar Multiplication 


To multiply a matrix by a number—or in matrix-algebra terminology, by a scalar—is to 
multiply every element of that matrix by the given scalar. 


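Example 4

$$7\begin{bmatrix} 3 & -1 \\ 0 & 5 \end{bmatrix} = \begin{bmatrix} 21 & -7 \\ 0 & 35 \end{bmatrix}$$

Example 5

$$\frac{1}{2}\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = \begin{bmatrix} \frac{1}{2}a_{11} & \frac{1}{2}a_{12} \\ \frac{1}{2}a_{21} & \frac{1}{2}a_{22} \end{bmatrix}$$

From these examples, the rationale of the name scalar should become clear, for it “scales
up (or down)” the matrix by a certain multiple. The scalar can, of course, be a negative
number as well.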


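Example 6

$$-1\begin{bmatrix} a_{11} & a_{12} & d_1 \\ a_{21} & a_{22} & d_2 \end{bmatrix} = \begin{bmatrix} -a_{11} & -a_{12} & -d_1 \\ -a_{21} & -a_{22} & -d_2 \end{bmatrix}$$

Note that if the matrix on the left represents the coefficients and the constant terms in the
simultaneous equations

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 &= d_1 \\ a_{21}x_1 + a_{22}x_2 &= d_2 \end{aligned}$$

then multiplication by the scalar $-1$ will amount to multiplying both sides of both
equations by $-1$, thereby changing the sign of every term in the system.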
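The three operations defined so far translate directly into code. The sketch below stores a matrix as a list of rows (one possible representation among many) and uses hypothetical helper names `dimension`, `add`, `scale`, and `subtract`; subtraction is deliberately written as the addition of $A$ and $(-1)B$, as suggested in the text.

```python
def dimension(m):
    """Return (number of rows, number of columns) of a matrix."""
    return len(m), len(m[0])

def add(a, b):
    """Add two matrices element by element; they must have the same dimension."""
    if dimension(a) != dimension(b):
        raise ValueError("matrices are not conformable for addition")
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def scale(k, a):
    """Multiply every element of matrix a by the scalar k."""
    return [[k * x for x in row] for row in a]

def subtract(a, b):
    """Compute A - B as A + (-1)B."""
    return add(a, scale(-1, b))

sum_1 = add([[4, 9], [2, 1]], [[2, 0], [0, 7]])          # the sum in Example 1
diff_3 = subtract([[19, 3], [2, 0]], [[6, 8], [1, 3]])   # the difference in Example 3
print(sum_1)    # [[6, 9], [2, 8]]
print(diff_3)   # [[13, -5], [1, -3]]
```

Attempting to add a $1 \times 2$ matrix to a $2 \times 1$ matrix with this `add` raises an error, which mirrors the conformability requirement.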




Multiplication of Matrices 

Whereas a scalar can be used to multiply a matrix of any dimension, the multiplication of
two matrices is contingent upon the satisfaction of a different dimensional requirement.

Suppose that, given two matrices $A$ and $B$, we want to find the product $AB$. The
conformability condition for multiplication is that the column dimension of $A$ (the “lead”
matrix in the expression $AB$) must be equal to the row dimension of $B$ (the “lag” matrix).
For instance, if

$$\underset{(1\times 2)}{A} = \begin{bmatrix} a_{11} & a_{12} \end{bmatrix} \qquad \underset{(2\times 3)}{B} = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix} \qquad (4.5)$$

the product $AB$ then is defined, since $A$ has two columns and $B$ has two rows, precisely the
same number.* This can be checked at a glance by comparing the second number in the
dimension indicator for $A$, which is $(1 \times 2)$, with the first number in the dimension indicator
for $B$, $(2 \times 3)$. On the other hand, the reverse product $BA$ is not defined in this case,
because $B$ (now the lead matrix) has three columns while $A$ (the lag matrix) has only one
row; hence the conformability condition is violated.

In general, if $A$ is of dimension $m \times n$ and $B$ is of dimension $p \times q$, the matrix product
$AB$ will be defined if and only if $n = p$. If defined, moreover, the product matrix $AB$ will
have the dimension $m \times q$: the same number of rows as the lead matrix $A$ and the same
number of columns as the lag matrix $B$. For the matrices given in (4.5), $AB$ will be $1 \times 3$.

It remains to define the exact procedure of multiplication. For this purpose, let us take
the matrices $A$ and $B$ in (4.5) for illustration. Since the product $AB$ is defined and is
expected to be of dimension $1 \times 3$, we may write in general (using the symbol $C$ rather than
$c'$ for the row vector) that

$$AB = C = \begin{bmatrix} c_{11} & c_{12} & c_{13} \end{bmatrix}$$

Each element in the product matrix $C$, denoted by $c_{ij}$, is defined as a sum of products, to be
computed from the elements in the $i$th row of the lead matrix $A$, and those in the $j$th column
of the lag matrix $B$. To find $c_{11}$, for instance, we should take the first row in $A$ (since $i = 1$)
and the first column in $B$ (since $j = 1$), as shown in the top panel of Fig. 4.1, and then
pair the elements together sequentially, multiply out each pair, and take the sum of the
resulting products, to get





$$c_{11} = a_{11}b_{11} + a_{12}b_{21} \qquad (4.6)$$

Similarly, for $c_{12}$, we take the first row in $A$ (since $i = 1$) and the second column in $B$ (since
$j = 2$), and calculate the indicated sum of products, in accordance with the lower panel of
Fig. 4.1, as follows:

$$c_{12} = a_{11}b_{12} + a_{12}b_{22} \qquad (4.6')$$

By the same token, we should also have

$$c_{13} = a_{11}b_{13} + a_{12}b_{23} \qquad (4.6'')$$

* The matrix $A$, being a row vector, would normally be denoted by $a'$. We use the symbol $A$ here to


stress the fact that the multiplication rule being explained applies to matrices in general, not only to 
the product of one vector and one matrix. 




FIGURE 4.1 [Figure not reproduced. Top panel: the first pair and second pair of elements multiplied to form $c_{11}$. Lower panel: the first pair and second pair for $c_{12}$.]

It is the particular pairing requirement in this process which necessitates the matching of
the column dimension of the lead matrix and the row dimension of the lag matrix before 
multiplication can be performed. 

The multiplication procedure illustrated in Fig. 4.1 can also be described by using the
concept of the inner product of two vectors. Given two vectors $u$ and $v$ with $n$ elements
each, say, $(u_1, u_2, \ldots, u_n)$ and $(v_1, v_2, \ldots, v_n)$, arranged either as two rows or as two
columns or as one row and one column, their inner product, written as $u \cdot v$ (with a dot in
the middle), is defined as


$$u \cdot v = u_1v_1 + u_2v_2 + \cdots + u_nv_n$$


This is a sum of products of corresponding elements, and hence the inner product of two
vectors is a scalar. 





Example 7

If, after a shopping trip, we arrange the quantities purchased of $n$ goods as a row vector
$Q' = [Q_1 \;\; Q_2 \;\; \cdots \;\; Q_n]$, and list the prices of those goods in a price vector
$P' = [P_1 \;\; P_2 \;\; \cdots \;\; P_n]$, then the inner product of these two vectors is

$$Q' \cdot P' = Q_1P_1 + Q_2P_2 + \cdots + Q_nP_n = \text{total purchase cost}$$
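The shopping-trip interpretation is easy to try out. In the sketch below the function name `inner_product` and the quantities and prices are hypothetical, chosen only to show the sum-of-products computation for three goods.

```python
def inner_product(u, v):
    """Sum of products of corresponding elements of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of elements")
    return sum(u_i * v_i for u_i, v_i in zip(u, v))

quantities = [3, 1, 2]    # hypothetical quantities purchased of three goods
prices = [4, 10, 5]       # hypothetical prices of the same goods

total_cost = inner_product(quantities, prices)
print(total_cost)         # 3*4 + 1*10 + 2*5 = 32
```

The result is a single number, in line with the remark that the inner product of two vectors is a scalar.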


Using this concept, we can describe the element $c_{ij}$ in the product matrix $C = AB$
simply as the inner product of the $i$th row of the lead matrix $A$ and the $j$th column of the lag
matrix $B$. By examining Fig. 4.1, we can easily verify the validity of this description.

The rule of multiplication just outlined applies with equal validity when the dimensions 
of A and B are other than those illustrated in Fig. 4.1; the only prerequisite is that the con- 
formability condition be met. 


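Example 8

Given

$$\underset{(3\times 2)}{A} = \begin{bmatrix} 1 & 3 \\ 2 & 8 \\ 4 & 0 \end{bmatrix} \qquad \text{and} \qquad \underset{(2\times 1)}{B} = \begin{bmatrix} 5 \\ 9 \end{bmatrix}$$

find $AB$. The product $AB$ is indeed defined because $A$ has two columns and $B$ has two rows.
Their product matrix should be $3 \times 1$, a column vector:

$$AB = \begin{bmatrix} 1(5) + 3(9) \\ 2(5) + 8(9) \\ 4(5) + 0(9) \end{bmatrix} = \begin{bmatrix} 32 \\ 82 \\ 20 \end{bmatrix}$$
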
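Example 9

Given

$$\underset{(3\times 3)}{A} = \begin{bmatrix} 3 & -1 & 2 \\ 1 & 0 & 3 \\ 4 & 0 & 2 \end{bmatrix} \qquad \text{and} \qquad \underset{(3\times 3)}{B} = \begin{bmatrix} 0 & -\frac{1}{5} & \frac{3}{10} \\ -1 & \frac{1}{5} & \frac{7}{10} \\ 0 & \frac{2}{5} & -\frac{1}{10} \end{bmatrix}$$

find $AB$. The same rule of multiplication now yields a very special product matrix:

$$AB = \begin{bmatrix} 0+1+0 & -\frac{3}{5}-\frac{1}{5}+\frac{4}{5} & \frac{9}{10}-\frac{7}{10}-\frac{2}{10} \\ 0+0+0 & -\frac{1}{5}+0+\frac{6}{5} & \frac{3}{10}+0-\frac{3}{10} \\ 0+0+0 & -\frac{4}{5}+0+\frac{4}{5} & \frac{12}{10}+0-\frac{2}{10} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

This last matrix, a square matrix with 1s in its principal diagonal (the diagonal running from
northwest to southeast) and 0s everywhere else, exemplifies the important type of matrix
known as the identity matrix. This will be further discussed in Section 4.5.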
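The multiplication rule, together with the conformability check, can be sketched as a short function. The name `matmul` and the list-of-rows representation are our own assumptions; the data are those of Examples 8 and 9 as we read them, with exact fractions used for the second factor of Example 9.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of an (m x n) lead matrix a and a (p x q) lag matrix b."""
    m, n = len(a), len(a[0])
    p, q = len(b), len(b[0])
    if n != p:
        raise ValueError("not conformable: columns of lead != rows of lag")
    # Element (i, j) is the inner product of row i of a and column j of b.
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(q)]
            for i in range(m)]

# Example 8: a (3 x 2) matrix times a (2 x 1) column vector.
product_8 = matmul([[1, 3], [2, 8], [4, 0]], [[5], [9]])
print(product_8)                     # [[32], [82], [20]]

# Example 9: the product turns out to be the 3 x 3 identity matrix.
A9 = [[3, -1, 2], [1, 0, 3], [4, 0, 2]]
B9 = [[0, F(-1, 5), F(3, 10)],
      [-1, F(1, 5), F(7, 10)],
      [0, F(2, 5), F(-1, 10)]]
product_9 = matmul(A9, B9)
identity_3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(product_9 == identity_3)       # True
```

Calling `matmul` with a $2 \times 3$ lead matrix and a $1 \times 2$ lag matrix raises an error, matching the observation that the reverse product $BA$ of (4.5) is not defined.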


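Example 10

Let us now take the matrix $A$ and the vector $x$ as defined in (4.4) and find $Ax$. The product
matrix is a $3 \times 1$ column vector:

$$\underset{(3\times 3)}{\begin{bmatrix} 6 & 3 & 1 \\ 1 & 4 & -2 \\ 4 & -1 & 5 \end{bmatrix}} \underset{(3\times 1)}{\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}} = \underset{(3\times 1)}{\begin{bmatrix} 6x_1 + 3x_2 + x_3 \\ x_1 + 4x_2 - 2x_3 \\ 4x_1 - x_2 + 5x_3 \end{bmatrix}}$$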

Note: The product on the right is a column vector, its corpulent appearance notwithstanding!
When we write $Ax = d$, therefore, we have

$$\begin{bmatrix} 6x_1 + 3x_2 + x_3 \\ x_1 + 4x_2 - 2x_3 \\ 4x_1 - x_2 + 5x_3 \end{bmatrix} = \begin{bmatrix} 22 \\ 12 \\ 10 \end{bmatrix}$$

which, according to the definition of matrix equality, is equivalent to the statement of the
entire equation system in (4.3).

Note that, to use the matrix notation $Ax = d$, it is necessary, because of the conformability
condition, to arrange the variables $x_j$ into a column vector, even though these variables
are listed in a horizontal order in the original equation system.
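The equivalence between $Ax = d$ and the system (4.3) can be tested against a candidate solution. The values $x_1 = 2$, $x_2 = 3$, $x_3 = 1$ used below are not derived in this section; they are supplied purely for illustration, and the code merely confirms that they satisfy all three equations. The helper `matmul` is a hypothetical name for the multiplication rule described earlier.

```python
def matmul(a, b):
    """Matrix product, with each element an inner product of a row and a column."""
    if len(a[0]) != len(b):
        raise ValueError("matrices are not conformable for multiplication")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

# Coefficient matrix and constant vector of (4.4).
A = [[6, 3, 1],
     [1, 4, -2],
     [4, -1, 5]]
d = [[22], [12], [10]]

# Candidate values arranged as a (3 x 1) column vector, as conformability requires.
x_candidate = [[2], [3], [1]]

lhs = matmul(A, x_candidate)
print(lhs)         # [[22], [12], [10]]
print(lhs == d)    # True: matrix equality holds element by element
```

Writing the candidate as a row, `[[2, 3, 1]]`, would make the product undefined here, which is the point of the remark about arranging the variables in a column.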


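Example 11

The simple national-income model in two endogenous variables $Y$ and $C$,

$$\begin{aligned} Y &= C + I_0 + G_0 \\ C &= a + bY \end{aligned}$$

can be rearranged into the standard format of (4.1) as follows:

$$\begin{aligned} Y - C &= I_0 + G_0 \\ -bY + C &= a \end{aligned}$$

Hence the coefficient matrix $A$, the vector of variables $x$, and the vector of constants $d$ are

$$\underset{(2\times 2)}{A} = \begin{bmatrix} 1 & -1 \\ -b & 1 \end{bmatrix} \qquad \underset{(2\times 1)}{x} = \begin{bmatrix} Y \\ C \end{bmatrix} \qquad \underset{(2\times 1)}{d} = \begin{bmatrix} I_0 + G_0 \\ a \end{bmatrix}$$

Let us verify that this given system can be expressed by the equation $Ax = d$.
By the rule of matrix multiplication, we have

$$Ax = \begin{bmatrix} 1 & -1 \\ -b & 1 \end{bmatrix} \begin{bmatrix} Y \\ C \end{bmatrix} = \begin{bmatrix} 1(Y) + (-1)(C) \\ -b(Y) + 1(C) \end{bmatrix} = \begin{bmatrix} Y - C \\ -bY + C \end{bmatrix}$$

Thus the matrix equation $Ax = d$ would give us

$$\begin{bmatrix} Y - C \\ -bY + C \end{bmatrix} = \begin{bmatrix} I_0 + G_0 \\ a \end{bmatrix}$$

Since matrix equality means the equality between corresponding elements, it is clear that
the equation $Ax = d$ does precisely represent the original equation system, as expressed in
the (4.1) format.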


The Question of Division 

While matrices, like numbers, can undergo the operations of addition, subtraction, and
multiplication (subject to the conformability conditions), it is not possible to divide one
matrix by another. That is, we cannot write A/B.

For two numbers a and b, the quotient a/b (with b ≠ 0) can be written alternatively as
$ab^{-1}$ or $b^{-1}a$, where $b^{-1}$ represents the inverse or reciprocal of b. Since
$ab^{-1} = b^{-1}a$, the quotient expression a/b can be used to represent both $ab^{-1}$ and
$b^{-1}a$. The case of matrices is different. Applying the concept of inverses to matrices,
we may in certain cases (discussed in Sec. 4.6) define a matrix $B^{-1}$ that is the inverse
of matrix B. But from the discussion of the conformability condition it follows that, if
$AB^{-1}$ is defined, there can be no assurance that $B^{-1}A$ is also defined. Even if
$AB^{-1}$ and $B^{-1}A$ are indeed both defined, they still may not represent the same
product. Hence the expression A/B cannot be used without ambiguity, and it must be avoided.
Instead, you must specify whether you are referring to $AB^{-1}$ or $B^{-1}A$, provided that
the inverse $B^{-1}$ does exist and that the matrix product in question is defined. Inverse
matrices will be further discussed in Sec. 4.6.
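
To see the ambiguity concretely, the sketch below compares $AB^{-1}$ with $B^{-1}A$ for one pair of 2 × 2 matrices. The matrices are hypothetical, and the closed-form 2 × 2 inverse used in `inverse_2x2` is borrowed ahead of the formal treatment of inverses; treat both as assumptions for illustration only.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Row-by-column product of two conformable matrices (lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the standard closed-form expression."""
    (p, q), (r, s) = M
    det = p * s - q * r
    if det == 0:
        raise ValueError("matrix has no inverse")
    return [[s / det, -q / det], [-r / det, p / det]]

# Hypothetical matrices, with exact rational entries to avoid rounding.
A = [[F(1), F(2)], [F(3), F(4)]]
B = [[F(2), F(0)], [F(0), F(1)]]

B_inv = inverse_2x2(B)
right = matmul(A, B_inv)   # A postmultiplied by the inverse of B
left = matmul(B_inv, A)    # A premultiplied by the inverse of B

print(right)
print(left)
print(right == left)
```

Both products are defined here, yet they differ, which is why a bare "A/B" would not identify a unique matrix.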


The Σ Notation
The use of subscripted symbols not only helps in designating the locations of parameters
and variables but also lends itself to a flexible shorthand for denoting sums of terms, such
as those which arose during the process of matrix multiplication.

The summation shorthand makes use of the Greek letter Σ (sigma, for “sum”). To
express the sum of $x_1$, $x_2$, and $x_3$, for instance, we may write

$$
x_1 + x_2 + x_3 = \sum_{j=1}^{3} x_j
$$


which is read as “the sum of $x_j$ as j ranges from 1 to 3.” The symbol j, called the
summation index, takes only integer values. The expression $x_j$ represents the summand
(that which is to be summed), and it is in effect a function of j. Aside from the letter j,
summation indices are also commonly denoted by i or k, such as

$$
\sum_{i=3}^{7} x_i = x_3 + x_4 + x_5 + x_6 + x_7
$$

$$
\sum_{k=0}^{n} x_k = x_0 + x_1 + \cdots + x_n
$$

The application of Σ notation can be readily extended to cases in which the x term is
prefixed with a coefficient or in which each term in the sum is raised to some integer power.
For instance, we may write:

$$
\sum_{j=1}^{3} a x_j = a x_1 + a x_2 + a x_3 = a(x_1 + x_2 + x_3) = a \sum_{j=1}^{3} x_j
$$

$$
\sum_{j=1}^{3} a_j x_j = a_1 x_1 + a_2 x_2 + a_3 x_3
$$

$$
\sum_{i=0}^{n} a_i x^i = a_0 x^0 + a_1 x^1 + a_2 x^2 + \cdots + a_n x^n
= a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n
$$

The last example, in particular, shows that the expression $\sum_{i=0}^{n} a_i x^i$ can in
fact be used as a shorthand form of the general polynomial function of (2.4).

It may be mentioned in passing that, whenever the context of the discussion leaves no
ambiguity as to the range of summation, the symbol Σ can be used alone, without an index
attached (such as $\sum x_i$), or with only the index letter underneath (such as $\sum_i x_i$).
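
The Σ shorthand maps directly onto a summation loop. The following sketch, offered only as an illustration, checks the factoring rule for a constant coefficient and evaluates a polynomial written as a sum of $a_i x^i$ terms. The function name `poly` and the sample numbers are assumptions, not part of the text.

```python
def poly(coeffs, x):
    """Evaluate sum_{i=0}^{n} a_i * x**i, where coeffs = [a_0, a_1, ..., a_n]."""
    return sum(a_i * x**i for i, a_i in enumerate(coeffs))

# A constant coefficient can be moved outside the summation sign.
xs = [4, 7, 1]                       # hypothetical x_1, x_2, x_3
a = 5
lhs = sum(a * x_j for x_j in xs)     # sum of a * x_j over j
rhs = a * sum(xs)                    # a times the sum of x_j
print(lhs, rhs)

# The polynomial 1 + 2x + 3x^2 evaluated at x = 2.
print(poly([1, 2, 3], 2))
```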


Let us apply the Σ shorthand to matrix multiplication. In (4.6), (4.6′), and (4.6″), each
element of the product matrix C = AB is defined as a sum of terms, which may now be
rewritten as follows:

$$
c_{11} = a_{11} b_{11} + a_{12} b_{21} = \sum_{k=1}^{2} a_{1k} b_{k1}
$$

$$
c_{12} = a_{11} b_{12} + a_{12} b_{22} = \sum_{k=1}^{2} a_{1k} b_{k2}
$$

$$
c_{13} = a_{11} b_{13} + a_{12} b_{23} = \sum_{k=1}^{2} a_{1k} b_{k3}
$$

In each case, the first subscript of $c_{1j}$ is reflected in the first subscript of $a_{1k}$,
and the second subscript of $c_{1j}$ is reflected in the second subscript of $b_{kj}$ in the
Σ expression. The index k, on the other hand, is a “dummy” subscript; it serves to indicate
which particular pair of elements is being multiplied, but it does not show up in the symbol
$c_{1j}$.


Extending this to the multiplication of an m × n matrix $A = [a_{ik}]$ and an n × p matrix
$B = [b_{kj}]$, we may now write the elements of the m × p product matrix
$AB = C = [c_{ij}]$ as

$$
c_{11} = \sum_{k=1}^{n} a_{1k} b_{k1} \qquad c_{12} = \sum_{k=1}^{n} a_{1k} b_{k2} \qquad \cdots
$$

or more generally,

$$
c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} \qquad (i = 1, 2, \ldots, m; \; j = 1, 2, \ldots, p)
$$

This last equation represents yet another way of stating the rule of multiplication for the
matrices defined above.
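
The summation rule for the elements of a product matrix translates almost word for word into code. The sketch below is one possible rendering, with the dummy index k appearing only inside the inner sum; the function name `matmul` and the sample matrices are illustrative assumptions.

```python
def matmul(A, B):
    """Return C = AB, where c_ij is the sum over k of a_ik * b_kj.

    A is m x n and B is n x p (lists of rows); the result is m x p.
    """
    m, n, p = len(A), len(B), len(B[0])
    if any(len(row) != n for row in A):
        raise ValueError("not conformable: columns of A must equal rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(n))   # k is the dummy index
             for j in range(p)]
            for i in range(m)]

A = [[1, 2],
     [3, 4]]
B = [[0, -1],
     [6, 7]]
print(matmul(A, B))
```

Note that i and j survive as positions in the result, whereas k disappears once the sum has been taken, just as in the symbol for a typical element of C.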





EXERCISE 4.2

1. Given $A = \begin{bmatrix} 7 & -1 \\ 6 & 9 \end{bmatrix}$,
   $B = \begin{bmatrix} 0 & 4 \\ 3 & -2 \end{bmatrix}$, and
   $C = \begin{bmatrix} 8 & 3 \\ 6 & 1 \end{bmatrix}$, find:
   (a) A + B   (b) C − A   (c) 3A   (d) 4B + 2C

2. Given $A = \begin{bmatrix} 2 & 8 \\ 3 & 0 \\ 5 & 1 \end{bmatrix}$,
   $B = \begin{bmatrix} 2 & 0 \\ 3 & 8 \end{bmatrix}$, and
   $C = \begin{bmatrix} 7 & 2 \\ 6 & 3 \end{bmatrix}$:
   (a) Is AB defined? Calculate AB. Can you calculate BA? Why?
   (b) Is BC defined? Calculate BC. Is CB defined? If so, calculate CB. Is it true that BC = CB?

3. On the basis of the matrices given in Example 9, is the product BA defined? If so,
   calculate the product. In this case do we have AB = BA?

4. Find the product matrices in the following (in each case, append beneath every matrix
   a dimension indicator):


02 0|/8 0 x 
03 0 4 E | ) [; 3 4p] 
23 0)L3 5 z 


4 -1 7 0 
|} ; “A; | @la b ala | 
01 14 


5. In Example 7, if we arrange the quantities and prices as column vectors instead of row
   vectors, is Q · P defined? Can we express the total purchase cost as Q · P? As Q′ · P? As
   Q′P?

6. Expand the following summation expressions:


5 fl 
co) Bax” 
w Yan @EGsH 

isS is 


‘ 
() E bx 
ind 


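7. Rewrite the following in Σ notation:
   (a) $x_1(x_1 - 1) + 2x_2(x_2 - 1) + 3x_3(x_3 - 1)$
   (b) $a_2(x_3 + 2) + a_3(x_4 + 3) + a_4(x_5 + 4)$
   (c) $\dfrac{1}{x} + \dfrac{1}{x^2} + \cdots + \dfrac{1}{x^n} \quad (x \neq 0)$
   (d) $1 + \dfrac{1}{x} + \dfrac{1}{x^2} + \cdots + \dfrac{1}{x^n} \quad (x \neq 0)$

8. Show that the following are true:
   (a) $\left( \sum_{i=0}^{n} x_i \right) + x_{n+1} = \sum_{i=0}^{n+1} x_i$
   (b) $\sum_{j=1}^{n} a b_j y_j = a \sum_{j=1}^{n} b_j y_j$
   (c) $\sum_{j=1}^{n} (x_j + y_j) = \sum_{j=1}^{n} x_j + \sum_{j=1}^{n} y_j$
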

4.3 Notes on Vector Operations 





In Secs. 4.1 and 4.2, vectors are considered as a special type of matrix. As such, they
qualify for the application of all the algebraic operations discussed. Owing to their
dimensional peculiarities, however, some additional comments on vector operations are useful.


Multiplication of Vectors
An m × 1 column vector u, and a 1 × n row vector $v'$, yield a product matrix $uv'$ of
dimension m × n.


Example 1
Given $u = \begin{bmatrix} 3 \\ 2 \end{bmatrix}$ and $v' = \begin{bmatrix} 1 & 4 & 5 \end{bmatrix}$, we can get

$$
uv' = \begin{bmatrix} 3(1) & 3(4) & 3(5) \\ 2(1) & 2(4) & 2(5) \end{bmatrix}
= \begin{bmatrix} 3 & 12 & 15 \\ 2 & 8 & 10 \end{bmatrix}
$$

Since each row in u consists of one element only, as does each column in $v'$, each element
of $uv'$ turns out to be a single product instead of a sum of products. The product $uv'$ is a
2 × 3 matrix, even though what we started out with are a pair of vectors.


On the other hand, given a 1 × n row vector $u'$ and an n × 1 column vector v, the product
$u'v$ will be of dimension 1 × 1.


Example 2
Given $u' = \begin{bmatrix} 3 & 4 \end{bmatrix}$ and $v = \begin{bmatrix} 9 \\ 7 \end{bmatrix}$, we have

$$
u'v = [3(9) + 4(7)] = [55]
$$
As written, $u'v$ is a matrix, despite the fact that only a single element is present. However,
1 × 1 matrices behave exactly like scalars with respect to addition and multiplication:
[4] + [8] = [12], just as 4 + 8 = 12; and [3][7] = [21], just as 3(7) = 21. Moreover, 1 × 1
matrices possess no major properties that scalars do not have. In fact, there is a one-to-one
correspondence between the set of all scalars and the set of all 1 × 1 matrices whose
elements are scalars. For this reason, we may redefine $u'v$ to be the scalar corresponding to
the 1 × 1 product matrix. For Example 2, we can accordingly write $u'v = 55$. Such a product is
called a scalar product.† Remember, however, that while a 1 × 1 matrix can be treated as a
scalar, a scalar cannot be replaced by a 1 × 1 matrix at will if further calculation is to be
carried out, because complications regarding conformability conditions may arise.


Example 3
Given a row vector $u' = \begin{bmatrix} 3 & 6 & 9 \end{bmatrix}$, find $u'u$. Since u is merely
the column vector with the elements of $u'$ arranged vertically, we have

$$
u'u = \begin{bmatrix} 3 & 6 & 9 \end{bmatrix} \begin{bmatrix} 3 \\ 6 \\ 9 \end{bmatrix}
= (3)^2 + (6)^2 + (9)^2
$$

where we have omitted the brackets from the 1 × 1 product matrix on the right. Note that
the product $u'u$ gives the sum of squares of the elements of u.
In general, if $u' = \begin{bmatrix} u_1 & u_2 & \cdots & u_n \end{bmatrix}$, then $u'u$ will be
the sum of squares (a scalar) of the elements $u_j$:

$$
u'u = u_1^2 + u_2^2 + \cdots + u_n^2 = \sum_{j=1}^{n} u_j^2
$$

Had we calculated the inner product $u \cdot u$ (or $u' \cdot u'$), we would have, of course,
obtained exactly the same result.


To conclude, it is important to distinguish between the meanings of $uv'$ (a matrix larger
than 1 × 1) and $u'v$ (a 1 × 1 matrix, or a scalar). Observe, in particular, that a scalar
product must have a row vector as the lead matrix and a column vector as the lag matrix;
otherwise the product cannot be 1 × 1.
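
The contrast between the two vector products can be sketched in a few lines. The function names `outer_product` and `scalar_product` are labels chosen here for illustration (they are not the text's terminology for code), and vectors are stored as plain lists, so the row or column orientation is implied by which function is called.

```python
def outer_product(u, v):
    """Column vector u (m x 1) times row vector v' (1 x n): an m x n matrix."""
    return [[u_i * v_j for v_j in v] for u_i in u]

def scalar_product(u, v):
    """Row vector u' (1 x n) times column vector v (n x 1): a single number."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of elements")
    return sum(u_i * v_i for u_i, v_i in zip(u, v))

print(outer_product([3, 2], [1, 4, 5]))   # a 2 x 3 matrix
print(scalar_product([3, 4], [9, 7]))     # a scalar
print(scalar_product([3, 6, 9], [3, 6, 9]))  # sum of squares of the elements
```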


Geometric Interpretation of Vector Operations 

It was mentioned earlier that a column or row vector with n elements (referred to hereafter
as an n-vector) can be viewed as an n-tuple, and hence as a point in an n-dimensional space
(referred to hereafter as an n-space). Let us elaborate on this idea. In Fig. 4.2a, a point (3, 2)
is plotted in a 2-space and is labeled u. This is the geometric counterpart of the vector
$u = \begin{bmatrix} 3 \\ 2 \end{bmatrix}$ or the vector $u' = \begin{bmatrix} 3 & 2 \end{bmatrix}$,
both of which indicate in this context one and the same ordered pair. If an arrow (a
directed-line segment) is drawn from the point of origin (0, 0) to the point u, it will specify
the unique straight route by which to reach the destination point u from the point of origin.
Since a unique arrow exists for each point, we can regard the vector u as graphically
represented either by the point (3, 2), or by the corresponding arrow. Such an arrow, which
emanates from the origin (0, 0) like the hand of a clock, with a definite length and a definite
direction, is called a radius vector.


†The concept of scalar product is thus akin to the concept of inner product of two vectors with the
same number of elements in each, which also yields a scalar. Recall, however, that the inner product is
exempted from the conformability condition for multiplication, so that we may write it as u · v. In the
case of scalar product (denoted without a dot between the two vector symbols), on the other hand,
we can express it only as a row vector multiplied by a column vector, with the row vector in the lead.


FIGURE 4.2 (panels a through d)


Following this new interpretation of a vector, it becomes possible to give geometric
meanings to (a) the scalar multiplication of a vector, (b) the addition and subtraction of
vectors, and more generally, (c) the so-called linear combination of vectors.


First, if we plot the vector 2u in Fig. 4.2a, the resulting arrow will overlap the
old one but will be twice as long. In fact, the multiplication of vector u by any scalar k will
produce an overlapping arrow, but the arrowhead will be relocated, unless k = 1. If the
scalar multiplier is k > 1, the arrow will be extended out (scaled up); if 0 < k < 1, the
arrow will be shortened (scaled down); if k = 0, the arrow will shrink into the point of
origin, which represents a null vector, $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$. A negative scalar
multiplier will even reverse the direction of the arrow. If the vector u is multiplied by −1,
for instance, we get $-u = \begin{bmatrix} -3 \\ -2 \end{bmatrix}$, and this plots in Fig. 4.2b as
an arrow of the same length as u but diametrically opposite in direction.
Next, consider the addition of two vectors, $v = \begin{bmatrix} 1 \\ 4 \end{bmatrix}$ and
$u = \begin{bmatrix} 3 \\ 2 \end{bmatrix}$. The sum $v + u = \begin{bmatrix} 4 \\ 6 \end{bmatrix}$
can be directly plotted as the broken arrow in Fig. 4.2c. If we construct a parallelogram


with the two vectors u and v (solid arrows) as two of its sides, however, the diagonal of the
parallelogram will turn out exactly to be the arrow representing the vector sum v + u. In
general, a vector sum can be obtained geometrically from a parallelogram. Moreover, this
method can also give us the vector difference v − u, since the latter is equivalent to the sum
of v and (−1)u. In Fig. 4.2d, we first reproduce the vector v and the negative vector −u
from diagrams c and b, respectively, and then construct a parallelogram. The resulting
diagonal represents the vector difference v − u.

It takes only a simple extension of these results to interpret geometrically a linear
combination (i.e., a linear sum or difference) of vectors. Consider the simple case of

$$
3v + 2u = 3 \begin{bmatrix} 1 \\ 4 \end{bmatrix} + 2 \begin{bmatrix} 3 \\ 2 \end{bmatrix}
= \begin{bmatrix} 9 \\ 16 \end{bmatrix}
$$

The scalar multiplication aspect of this operation involves the relocation of the respective
arrowheads of the two vectors v and u, and the addition aspect calls for the construction of
a parallelogram. Beyond these two basic graphical operations, there is nothing new in a
linear combination of vectors. This is true even if there are more terms in the linear
combination, as in

$$
\sum_{i=1}^{n} k_i v_i = k_1 v_1 + k_2 v_2 + \cdots + k_n v_n
$$

where $k_i$ are a set of scalars but the subscripted symbols $v_i$ now denote a set of vectors.
To form this sum, the first two terms may be added first, and then the resulting sum is added
to the third, and so forth, till all terms are included.


Linear Dependence

A set of vectors $v_1, \ldots, v_n$ is said to be linearly dependent if (and only if) any one of
them can be expressed as a linear combination of the remaining vectors; otherwise they are
linearly independent.


Example 4
The three vectors $v_1 = \begin{bmatrix} 2 \\ 7 \end{bmatrix}$,
$v_2 = \begin{bmatrix} 1 \\ 8 \end{bmatrix}$, and $v_3 = \begin{bmatrix} 4 \\ 5 \end{bmatrix}$ are
linearly dependent because $v_3$ is a linear combination of $v_1$ and $v_2$:

$$
3v_1 - 2v_2 = \begin{bmatrix} 6 \\ 21 \end{bmatrix} - \begin{bmatrix} 2 \\ 16 \end{bmatrix}
= \begin{bmatrix} 4 \\ 5 \end{bmatrix} = v_3
$$

Note that this last equation is alternatively expressible as

$$
3v_1 - 2v_2 - v_3 = 0
$$

where $0 = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$ represents a null vector (also called the zero vector).


Example 5
The two row vectors $v_1' = \begin{bmatrix} 5 & 12 \end{bmatrix}$ and
$v_2' = \begin{bmatrix} 10 & 24 \end{bmatrix}$ are linearly dependent because

$$
2v_1' = 2 \begin{bmatrix} 5 & 12 \end{bmatrix} = \begin{bmatrix} 10 & 24 \end{bmatrix} = v_2'
$$

The fact that one vector is a multiple of another vector illustrates the simplest case of linear
combination. Note again that this last equation may be written equivalently as

$$
2v_1' - v_2' = 0'
$$

where $0'$ represents the null row vector $\begin{bmatrix} 0 & 0 \end{bmatrix}$.
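
A linear combination is simply a scalar-weighted sum of vectors taken element by element, so a dependence relation can be checked by confirming that some not-all-zero set of weights produces the null vector. The sketch below is illustrative only: the function name `linear_combination` is an assumption, and the sample vectors are small integer 2-vectors chosen so that the arithmetic is easy to follow by hand.

```python
def linear_combination(scalars, vectors):
    """Return k_1*v_1 + ... + k_n*v_n for equal-length vectors (as lists)."""
    if len(scalars) != len(vectors):
        raise ValueError("need exactly one scalar per vector")
    size = len(vectors[0])
    return [sum(k * v[j] for k, v in zip(scalars, vectors)) for j in range(size)]

# Three 2-vectors tied together by the relation 3*v1 - 2*v2 - v3 = 0.
v1, v2, v3 = [2, 7], [1, 8], [4, 5]
print(linear_combination([3, -2], [v1, v2]))          # reproduces v3
print(linear_combination([3, -2, -1], [v1, v2, v3]))  # the null vector

# The simplest case: one vector is a multiple of the other.
print(linear_combination([2, -1], [[5, 12], [10, 24]]))
```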


With the introduction of null vectors, linear dependence may be redefined as follows. A
set of m-vectors $v_1, \ldots, v_n$ is linearly dependent if and only if there exists a set of
scalars $k_1, \ldots, k_n$ (not all zero) such that

$$
\sum_{i=1}^{n} k_i v_i = \underset{(m \times 1)}{0}
$$

If this equation can be satisfied only when $k_i = 0$ for all i, on the other hand, these vectors
are linearly independent.

The concept of linear dependence admits of an easy geometric interpretation also. Two
vectors u and 2u, one being a multiple of the other, are obviously dependent. Geometrically,
in Fig. 4.2a, their arrows lie on a single straight line. The same is true of the two
dependent vectors u and −u in Fig. 4.2b. In contrast, the two vectors u and v of Fig. 4.2c
are linearly independent, because it is impossible to express one as a multiple of the other.
Geometrically, their arrows do not lie on a single straight line.

When more than two vectors in the 2-space are considered, there emerges this significant
conclusion: once we have found two linearly independent vectors in the 2-space (say, u and v),
all the other vectors in that space will be expressible as a linear combination of these (u and v).
In Fig. 4.2c and d, it has already been illustrated how the two simple linear combinations v + u
and v − u can be found. Furthermore, by extending, shortening, and reversing the given vectors
u and v and then combining these into various parallelograms, we can generate an infinite
number of new vectors, which will exhaust the set of all 2-vectors. Because of this, any set of
three or more 2-vectors (three or more vectors in a 2-space) must be linearly dependent. Two
of them can be independent, but then the third must be a linear combination of the first two.
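
The claim that any third 2-vector is a linear combination of two independent ones can be made concrete by actually solving for the weights. The sketch below does so with the determinant-ratio formula for a 2 × 2 system (Cramer's rule), which is an assumption brought in for illustration and is not derived in this section. The function name `coordinates` and the sample vectors are likewise hypothetical.

```python
from fractions import Fraction

def coordinates(u, v, w):
    """Return (k1, k2) such that k1*u + k2*v = w for 2-vectors u, v, w.

    Raises ValueError when u and v lie on one straight line through the
    origin, since then they cannot generate every 2-vector.
    """
    det = u[0] * v[1] - u[1] * v[0]
    if det == 0:
        raise ValueError("u and v are linearly dependent")
    k1 = Fraction(w[0] * v[1] - w[1] * v[0], det)
    k2 = Fraction(u[0] * w[1] - u[1] * w[0], det)
    return k1, k2

u, v = (3, 2), (1, 4)               # two independent 2-vectors
print(coordinates(u, v, (4, 6)))    # weights that rebuild the sum u + v
print(coordinates(u, v, (9, 16)))   # weights for another 2-vector
```

A zero determinant is exactly the case in which the two arrows lie on a single straight line, so the same test doubles as a check for linear dependence of two 2-vectors.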





Vector Space 
The totality of the 2-vectors generated by the various linear combinations of two independent
vectors u and v constitutes the two-dimensional vector space. Since we are dealing
only with vectors with real-valued elements, this vector space is none other than $R^2$, the
2-space we have been referring to all along. The 2-space cannot be generated by a single
2-vector, because linear combinations of the latter can only give rise to the set of vectors
lying on a single straight line. Nor does the generation of the 2-space require more than two
linearly independent 2-vectors; at any rate, it would be impossible to find more than two.

The two linearly independent vectors u and v are said to span the 2-space. They are also
said to constitute a basis for the 2-space. Note that we said a basis, not the basis, because
any pair of 2-vectors can serve in that capacity as long as they are linearly independent. In
particular, consider the two vectors [1 0] and [0 1], which are called unit vectors. The
first one plots as an arrow lying along the horizontal axis, and the second, an arrow lying
along the vertical axis. Because they are linearly independent, they can serve as a basis for
the 2-space, and we do in fact ordinarily think of the 2-space as spanned by its two axes,
which are nothing but the extended versions of the two unit vectors.

By analogy, the three-dimensional vector space is the totality of 3-vectors, and it must
be spanned by exactly three linearly independent 3-vectors. As an illustration, consider the
set of three unit vectors

$$
e_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \qquad
e_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \qquad
e_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
\tag{4.7}
$$

where each $e_i$ is a vector with 1 as its ith element and with zeros elsewhere.† These three
vectors are obviously linearly independent; in fact, their arrows lie on the three axes of the
3-space in Fig. 4.3. Thus they span the 3-space, which implies that the entire 3-space ($R^3$,
in our framework) can be generated from these unit vectors. For example, the vector
$\begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$ can be considered as the linear combination
$e_1 + 2e_2 + 2e_3$. Geometrically, we can first add the vectors $e_1$ and $2e_2$ in Fig. 4.3 by
the parallelogram method, in order to get the vector represented by the point (1, 2, 0) in the
$x_1 x_2$ plane, and then add the latter vector to $2e_3$, via the parallelogram constructed in
the shaded vertical plane, to obtain the desired final result, at the point (1, 2, 2).

FIGURE 4.3

The further extension to n-space should be obvious. The n-space can be defined as the
totality of n-vectors. Though nongraphable, we can still think of the n-space as being
spanned by a total of n (n-element) unit vectors that are all linearly independent. Each
n-vector, being an ordered n-tuple, represents a point in the n-space, or an arrow extending
from the point of origin (i.e., the n-element null vector) to the said point. And any given set
of n linearly independent n-vectors is, in fact, capable of generating the entire n-space.
Since, in our discussion, each element of the n-vector is restricted to be a real number, this
n-space is in fact $R^n$.

The n-space we have referred to is sometimes more specifically called the Euclidean
n-space (named after Euclid). To explain this latter concept, we must first comment briefly
on the concept of distance between two vector points. For any pair of vector points u and v
in a given space, the distance from u to v is some real-valued function

$$
d = d(u, v)
$$





with the following properties: (1) when u and v coincide, the distance is zero; (2) when the
two points are distinct, the distance from u to v and the distance from v to u are represented
by an identical positive real number; and (3) the distance between u and v is never longer
than the distance from u to w (a point distinct from u and v) plus the distance from w to v.
Expressed symbolically,

$$
\begin{aligned}
d(u, v) &= 0 && (\text{for } u = v) \\
d(u, v) &= d(v, u) > 0 && (\text{for } u \neq v) \\
d(u, v) &\leq d(u, w) + d(w, v) && (\text{for } w \neq u, v)
\end{aligned}
$$

The last property is known as the triangular inequality, because the three points u, v, and
w together will usually define a triangle.

† The symbol e may be associated with the German word eins, for “one.”

When a vector space has a distance function defined that fulfills the previous three
properties, it is called a metric space. However, note that the distance d(u, v) has been
discussed only in general terms. Depending on the specific form assigned to the function,
there may result a variety of metric spaces. The so-called Euclidean space is one specific type
of metric space, with a distance function defined as follows. Let point u be the n-tuple
$(a_1, a_2, \ldots, a_n)$ and point v be the n-tuple $(b_1, b_2, \ldots, b_n)$; then the
Euclidean distance function is

$$
d(u, v) = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + \cdots + (a_n - b_n)^2}
$$

where the square root is taken to be positive. As can be easily verified, this specific distance
function satisfies all three properties previously enumerated. Applied to the two-dimensional
space in Fig. 4.2a, the distance between the two points (6, 4) and (3, 2) is found to be

$$
\sqrt{(6 - 3)^2 + (4 - 2)^2} = \sqrt{3^2 + 2^2} = \sqrt{13}
$$

This result is seen to be consistent with Pythagoras’ theorem, which states that the length
of the hypotenuse of a right-angled triangle is equal to the (positive) square root of the sum
of the squares of the lengths of the other two sides. For if we take (6, 4) and (3, 2) to be u
and v, and plot a new point w at (6, 2), we shall indeed have a right-angled triangle with the
lengths of its horizontal and vertical sides equal to 3 and 2, respectively, and the length of
the hypotenuse (the distance between u and v) equal to $\sqrt{3^2 + 2^2} = \sqrt{13}$.

The Euclidean distance function can also be expressed in terms of the square root of
a scalar product of two vectors. Since u and v denote the two n-tuples $(a_1, \ldots, a_n)$ and
$(b_1, \ldots, b_n)$, we can write a column vector u − v, with elements
$a_1 - b_1, a_2 - b_2, \ldots, a_n - b_n$. What goes under the square-root sign in the Euclidean
distance function is, of course, simply the sum of squares of these n elements, which, in view
of Example 3 of this section, can be written as the scalar product $(u - v)'(u - v)$. Hence we
have

$$
d(u, v) = \sqrt{(u - v)'(u - v)}
$$
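
The Euclidean distance function is short enough to write out and test against the three metric properties. The sketch below is a minimal illustration; the function name `distance` is an assumption, and the points used are the ones from the two-dimensional example (with w taken at the corner of the right-angled triangle).

```python
import math

def distance(u, v):
    """Euclidean distance: square root of the scalar product (u - v)'(u - v)."""
    if len(u) != len(v):
        raise ValueError("points must lie in the same n-space")
    diff = [a - b for a, b in zip(u, v)]           # the vector u - v
    return math.sqrt(sum(x * x for x in diff))     # sum of squared elements

u, v, w = (6, 4), (3, 2), (6, 2)
print(distance(u, v))                                    # square root of 13
print(distance(u, u))                                    # property (1): zero
print(distance(u, v) == distance(v, u))                  # property (2): symmetry
print(distance(u, v) <= distance(u, w) + distance(w, v)) # property (3)
```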





EXERCISE 4.3
1. Given $u' = [5 \;\; 1 \;\; 3]$, $v' = [3 \;\; 1 \;\; -1]$, $w' = [7 \;\; 5 \;\; 8]$, and
   $x' = [x_1 \;\; x_2 \;\; x_3]$, write out the column vectors, u, v, w, and x, and find
   (a) $uv'$   (c) $xx'$   (e) $u'v$   (g) $u'u$
   (b) $uw'$   (d) $v'u$   (f) $w'x$   (h) $x'x$

2. Given $w = \begin{bmatrix} 3 \\ 2 \\ 16 \end{bmatrix}$,
   $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$,
   $y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$, and
   $z = \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}$:
   (a) Which of the following are defined: $w'x$, $x'y'$, $xy'$, $y'y$, $zz'$, $yw'$, $x \cdot y$?
   (b) Find all the products that are defined.
3. Having sold n items of merchandise at quantities $Q_1, \ldots, Q_n$ and prices
   $P_1, \ldots, P_n$, how would you express the total revenue in (a) Σ notation and
   (b) vector notation?
4. Given two nonzero vectors $w_1$ and $w_2$, the angle θ (0° ≤ θ ≤ 180°) they form is related
   to the scalar product $w_1' w_2$ (= $w_2' w_1$) as follows:

   θ is a(n) acute / right / obtuse angle if and only if $w_1' w_2$ is > 0 / = 0 / < 0, respectively.

   Verify this by computing the scalar product for each of the following pairs of vectors (see
   Figs. 4.2 and 4.3):


eno] om-[i-[9 


ol] wll 
omen [] 


2 
5. Given v= [; | and v= [s ): find the following graphically: 





   (a) 2v   (c) u − v   (e) 2u + 3v
   (b) u + v   (d) v − u   (f) 4u − 2v


6. Since the 3-space is spanned by the three unit vectors defined in (4.7), any other
   3-vector should be expressible as a linear combination of $e_1$, $e_2$, and $e_3$. Show that
   the following 3-vectors can be so expressed:
   (a) $\begin{bmatrix} 4 \\ 7 \\ 0 \end{bmatrix}$   (b) $\begin{bmatrix} 25 \\ -2 \\ 1 \end{bmatrix}$
   (c) $\begin{bmatrix} -1 \\ 6 \\ 9 \end{bmatrix}$   (d) $\begin{bmatrix} 2 \\ 0 \\ 8 \end{bmatrix}$

7. In the three-dimensional Euclidean space, what is the distance between the following
   points?
   (a) (3, 2, 8) and (0, -7, 5)   (b) (9, 0, 4) and (2, 0, -4)

8. The triangular inequality is written with the weak inequality sign ≤, rather than the
   strict inequality sign <. Under what circumstances would the “=” part of the inequality
   apply?

9. Express the length of a radius vector v in the Euclidean n-space (i.e., the distance from
   the origin to point v) by using each of the following:
   (a) scalars   (b) a scalar product   (c) an inner product




4.4 Commutative, Associative, and Distributive Laws 





In ordinary scalar algebra, the additive and multiplicative operations obey the commutative,
associative, and distributive laws as follows:

Commutative law of addition: a + b = b + a
Commutative law of multiplication: ab = ba
Associative law of addition: (a + b) + c = a + (b + c)
Associative law of multiplication: (ab)c = a(bc)
Distributive law: a(b + c) = ab + ac

These have been referred to during the discussion of the similarly named laws applicable to
the union and intersection of sets. Most, but not all, of these laws also apply to matrix
operations, the significant exception being the commutative law of multiplication.


Matrix Addition
Matrix addition is commutative as well as associative. This follows from the fact that matrix
addition calls only for the addition of the corresponding elements of two matrices, and
that the order in which each pair of corresponding elements is added is immaterial. In this
context, incidentally, the subtraction operation A − B can simply be regarded as the addition
operation A + (−B), and thus no separate discussion is necessary.

The commutative and associative laws can be stated as follows:

Commutative law   A + B = B + A
PROOF   $A + B = [a_{ij}] + [b_{ij}] = [a_{ij} + b_{ij}] = [b_{ij} + a_{ij}] = B + A$


Example 1
Given $A = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix}$ and
$B = \begin{bmatrix} 6 & 2 \\ 3 & 4 \end{bmatrix}$, we find that

$$
A + B = B + A = \begin{bmatrix} 9 & 3 \\ 3 & 6 \end{bmatrix}
$$

Associative law   (A + B) + C = A + (B + C)
PROOF   $(A + B) + C = [a_{ij} + b_{ij}] + [c_{ij}] = [a_{ij} + b_{ij} + c_{ij}]
= [a_{ij}] + [b_{ij} + c_{ij}] = A + (B + C)$


Example 2
Given $v_1 = \begin{bmatrix} 3 \\ 4 \end{bmatrix}$, $v_2 = \begin{bmatrix} 9 \\ 1 \end{bmatrix}$, and
$v_3 = \begin{bmatrix} 2 \\ 5 \end{bmatrix}$, we find that

$$
(v_1 + v_2) - v_3 = \begin{bmatrix} 12 \\ 5 \end{bmatrix} - \begin{bmatrix} 2 \\ 5 \end{bmatrix}
= \begin{bmatrix} 10 \\ 0 \end{bmatrix}
$$

which is equal to

$$
v_1 + (v_2 - v_3) = \begin{bmatrix} 3 \\ 4 \end{bmatrix} + \begin{bmatrix} 7 \\ -4 \end{bmatrix}
= \begin{bmatrix} 10 \\ 0 \end{bmatrix}
$$

Applied to the linear combination of vectors $k_1 v_1 + \cdots + k_n v_n$, the associative law
permits us to select any pair of terms for addition (or subtraction) first, instead of having to
follow the sequence in which the n terms are listed.


Matrix Multiplication
Matrix multiplication is not commutative, that is,

$$
AB \neq BA
$$

As explained previously, even when AB is defined, BA may not be; but even if both products
are defined, the general rule is still AB ≠ BA.


Example 3
Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and
$B = \begin{bmatrix} 0 & -1 \\ 6 & 7 \end{bmatrix}$; then

$$
AB = \begin{bmatrix} 1(0) + 2(6) & 1(-1) + 2(7) \\ 3(0) + 4(6) & 3(-1) + 4(7) \end{bmatrix}
= \begin{bmatrix} 12 & 13 \\ 24 & 25 \end{bmatrix}
$$

but

$$
BA = \begin{bmatrix} 0(1) - 1(3) & 0(2) - 1(4) \\ 6(1) + 7(3) & 6(2) + 7(4) \end{bmatrix}
= \begin{bmatrix} -3 & -4 \\ 27 & 40 \end{bmatrix}
$$


Example 4
Let $u'$ be 1 × 3 (a row vector); then the corresponding column vector u must be 3 × 1. The
product $u'u$ will be 1 × 1, but the product $uu'$ will be 3 × 3. Thus, obviously, $u'u \neq uu'$.


In view of the general rule AB ≠ BA, the terms premultiply and postmultiply are often
used to specify the order of multiplication. In the product AB, the matrix B is said to be
premultiplied by A, and A to be postmultiplied by B.

There do exist interesting exceptions to the rule AB ≠ BA, however. One such case is
when A is a square matrix and B is an identity matrix. Another is when A is the inverse of
B, that is, when $A = B^{-1}$. Both of these will be taken up again later. It should also be
remarked here that the scalar multiplication of a matrix does obey the commutative law;
thus, if k is a scalar, then

$$
kA = Ak
$$

Although it is not in general commutative, matrix multiplication is associative.

Associative law   (AB)C = A(BC) = ABC

In forming the product ABC, the conformability condition must naturally be satisfied by
each adjacent pair of matrices. If A is m × n and if C is p × q, then conformability requires
that B be n × p:

$$
\underset{(m \times n)}{A} \; \underset{(n \times p)}{B} \; \underset{(p \times q)}{C}
$$

Note the dual appearance of n and p in the dimension indicators. If the conformability
condition is met, the associative law states that any adjacent pair of matrices may be
multiplied out first, provided that the product is duly inserted in the exact place of the
original pair.


Example 5
If $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ and
$A = \begin{bmatrix} a_{11} & 0 \\ 0 & a_{22} \end{bmatrix}$, then

$$
x'Ax = x'(Ax) = \begin{bmatrix} x_1 & x_2 \end{bmatrix}
\begin{bmatrix} a_{11} x_1 \\ a_{22} x_2 \end{bmatrix}
= a_{11} x_1^2 + a_{22} x_2^2
$$

Exactly the same result comes from

$$
(x'A)x = \begin{bmatrix} a_{11} x_1 & a_{22} x_2 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= a_{11} x_1^2 + a_{22} x_2^2
$$
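
Both properties of matrix multiplication, the failure of commutativity and the validity of the associative law, are easy to confirm numerically. In the sketch below, the third matrix C and the helper `matmul` are hypothetical choices made for illustration; any conformable matrices would serve.

```python
def matmul(A, B):
    """Row-by-column product of two conformable matrices (lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, -1], [6, 7]]
C = [[1, 0], [2, 1]]     # hypothetical third matrix

AB = matmul(A, B)
BA = matmul(B, A)
print(AB, BA, AB == BA)              # the order of multiplication matters

left_first = matmul(matmul(A, B), C)   # (AB)C
right_first = matmul(A, matmul(B, C))  # A(BC)
print(left_first == right_first)       # grouping does not matter
```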


In Example 5, the square matrix A has nonzero elements $a_{11}$ and $a_{22}$ in the principal
diagonal, and zeros everywhere else. Such a matrix is called a diagonal matrix. When a
diagonal matrix A appears in the product $x'Ax$, the resulting product gives a “weighted”
sum of squares, the weights for the $x_1^2$ and the $x_2^2$ terms being supplied by the elements
in the diagonal of A. This result is in contrast to the scalar product $x'x$, which yields a
simple (unweighted) sum of squares.


Example 6
Let the economic ideal be defined as the national-income level $Y^0$ coupled with the inflation
rate $p^0$. And suppose that we view any positive deviation of the actual income Y from $Y^0$ to
be equally undesirable as a negative deviation of the same magnitude, and similarly for
deviations of the actual inflation rate p from $p^0$. Then we may write a social-loss function
such as

$$
\Lambda = \alpha (Y - Y^0)^2 + \beta (p - p^0)^2
$$

where α and β are the weights assigned to the two sources of social loss. If deviations of Y
are considered to be the more serious type of loss, then α should exceed β. Note that the
squaring of the deviations produces two effects. First, upon squaring, a positive deviation
will receive the same loss value as a negative deviation of the same numerical magnitude.
Second, squaring causes the larger deviations to show up much more significantly in the
social-loss measure than minor deviations. Such a social-loss function can be expressed, if
desired, by the matrix product

$$
\begin{bmatrix} Y - Y^0 & p - p^0 \end{bmatrix}
\begin{bmatrix} \alpha & 0 \\ 0 & \beta \end{bmatrix}
\begin{bmatrix} Y - Y^0 \\ p - p^0 \end{bmatrix}
$$

Matrix multiplication is also distributive.

Distributive law   A(B + C) = AB + AC   [premultiplication by A]
                   (B + C)A = BA + CA   [postmultiplication by A]

In each case, the conformability conditions for addition as well as for multiplication must,
of course, be observed.





EXERCISE 4.4 


1, Given A= [? ‘]- Bm [3 4) anac=[} 9 very tha 
   (a) (A + B) + C = A + (B + C)

   (b) (A + B) − C = A + (B − C)

2. The subtraction of a matrix B may be considered as the addition of the matrix (−1)B.
   Does the commutative law of addition permit us to state that A − B = B − A? If not,
   how would you correct the statement?

3. Test the associative law of multiplication with the following matrices:


10 
§ 3 -8 0 7 
a= | a-[ 13 | c-[2 | 


4. Prove that for any two scalars g and k

   (a) k(A + B) = kA + kB

   (b) (g + k)A = gA + kA

   (Note: To prove a result, you cannot use specific examples.)
5. For (a) through (d) find C = AB.


of) BE 
[47 385 
A=), r] a-[3 6 | 


7 
124 5 
(j A=] 2 ” e-["3 6 ] 





L106 
r 10 1 
(A= § ; A Beli 3 
i. 29 
{e) Find () C = AB, and (ii) D = BA, if 
[ -2 
A=| 4 B=[3 6 -2] 
L 7 


6. Prove that (A + B)(C + D) = AC + AD + BC + BD.

7. If the matrix A in Example 5 had all its four elements nonzero, would $x'Ax$ still give a
   weighted sum of squares? Would the associative law still apply?

8. Name some situations or contexts where the notion of a weighted or unweighted sum 
of squares may be relevant. 


4.5 Identity Matrices and Null Matrices 





Identity Matrices 

We have referred earlier to the term identity matrix. Such a matrix is defined as a square
(repeat: square) matrix with 1s in its principal diagonal and 0s everywhere else. It is
denoted by the symbol I, or $I_n$, in which the subscript n serves to indicate its row (as well
as column) dimension. Thus,

$$
I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \qquad
I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
$$

But both of these can also be denoted by I.

The importance of this special type of matrix lies in the fact that it plays a role similar
to that of the number 1 in scalar algebra. For any number a, we have 1(a) = a(1) = a.
Similarly, for any matrix A, we have

$$
IA = AI = A \tag{4.8}
$$


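Example 1
Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 3 \end{bmatrix}$; then

$$
IA = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 3 \end{bmatrix}
= \begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 3 \end{bmatrix} = A
$$

$$
AI = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 3 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 2 & 3 \\ 2 & 0 & 3 \end{bmatrix} = A
$$

Because A is 2 × 3, premultiplication and postmultiplication of A by I would call for identity
matrices of different dimensions, namely, $I_2$ and $I_3$, respectively. But in case A is n × n,
then the same identity matrix $I_n$ can be used, so that (4.8) becomes $I_n A = A I_n$, thus
illustrating an exception to the rule that matrix multiplication is not commutative.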


The special nature of identity matrices makes it possible, during the multiplication
process, to insert or delete an identity matrix without affecting the matrix product. This
follows directly from (4.8). Recalling the associative law, we have, for instance,

$$
\underset{(m \times n)}{A} \; \underset{(n \times n)}{I} \; \underset{(n \times p)}{B}
= (AI)B = \underset{(m \times n)}{A} \; \underset{(n \times p)}{B}
$$

which shows that the presence or absence of I does not affect the product. Observe that
dimension conformability is preserved whether or not I appears in the product.
An interesting case of (4.8) occurs when $A = I_n$, for then we have

$$
A I_n = (I_n)^2 = I_n
$$

which states that an identity matrix squared is equal to itself. A generalization of this result
is that

$$
(I_n)^k = I_n \qquad (k = 1, 2, \ldots)
$$

An identity matrix remains unchanged when it is multiplied by itself any number of times.
Any matrix with such a property (namely, AA = A) is referred to as an idempotent matrix.
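
The role of the identity matrix can be verified with a small sketch. The helper names `identity` and `matmul` are assumptions made for illustration; the 2 × 3 matrix is an arbitrary nonsquare example, chosen so that premultiplication and postmultiplication need identity matrices of different sizes.

```python
def identity(n):
    """Return the n x n identity matrix: 1s on the principal diagonal, 0s elsewhere."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Row-by-column product of two conformable matrices (lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2, 3],
     [2, 0, 3]]                           # a 2 x 3 matrix

print(matmul(identity(2), A) == A)        # premultiplication uses the 2 x 2 identity
print(matmul(A, identity(3)) == A)        # postmultiplication uses the 3 x 3 identity

I3 = identity(3)
print(matmul(I3, I3) == I3)               # idempotency of the identity matrix
```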


Null Matrices 

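Just as an identity matrix I plays the role of the number 1, a null matrix (or zero matrix),
denoted by 0, plays the role of the number 0. A null matrix is simply a matrix whose
elements are all zero. Unlike I, the zero matrix is not restricted to being square. Thus it is
possible to write

$$
\underset{(2 \times 2)}{0} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\qquad \text{and} \qquad
\underset{(2 \times 3)}{0} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
$$

and so forth. A square null matrix is idempotent, but a nonsquare one is not. (Why?)
As the counterpart of the number 0, null matrices obey the following rules of operation
(subject to conformability) with regard to addition and multiplication:

$$
\underset{(m \times n)}{A} + \underset{(m \times n)}{0}
= \underset{(m \times n)}{0} + \underset{(m \times n)}{A}
= \underset{(m \times n)}{A}
$$

$$
\underset{(m \times n)}{A} \; \underset{(n \times p)}{0} = \underset{(m \times p)}{0}
\qquad \text{and} \qquad
\underset{(q \times m)}{0} \; \underset{(m \times n)}{A} = \underset{(q \times n)}{0}
$$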


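Note that, in multiplication, the null matrix to the left of the equals sign and the one to the
right may be of different dimensions.

Example 2

\[ A + 0 = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = A \]

Example 3

\[ \underset{(2 \times 3)}{A}\ \underset{(3 \times 1)}{0} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \underset{(2 \times 1)}{0} \]

To the left, the null matrix is a 3 × 1 null vector; to the right, it is a 2 × 1 null vector.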


Idiosyncrasies of Matrix Algebra 
Despite the apparent similarities between matrix algebra and scalar algebra, the case of
matrices does display certain idiosyncrasies that serve to warn us not to “borrow” from
scalar algebra too unquestioningly. We have already seen that, in general, AB ≠ BA in
matrix algebra. Let us look at two more such idiosyncrasies of matrix algebra.

For one thing, in the case of scalars, the equation ab = 0 always implies that either a or
b is zero, but this is not so in matrix multiplication. Thus, we have

\[ AB = \begin{bmatrix} 2 & 4 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} -2 & 4 \\ 1 & -2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]

although neither A nor B is itself a zero matrix.
As another illustration, for scalars, the equation cd = ce (with c ≠ 0) implies that
d = e. The same does not hold for matrices. Thus, given

\[ C = \begin{bmatrix} 2 & 3 \\ 6 & 9 \end{bmatrix} \qquad D = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \qquad E = \begin{bmatrix} -2 & 1 \\ 3 & 2 \end{bmatrix} \]

we find that

\[ CD = CE = \begin{bmatrix} 5 & 8 \\ 15 & 24 \end{bmatrix} \]

even though D ≠ E.

These strange results actually pertain only to the special class of matrices known as
singular matrices, of which the matrices A, B, and C are examples. (Roughly, these matrices
contain a row which is a multiple of another row.) Nevertheless, such examples do
reveal the pitfalls of unwarranted extension of algebraic theorems to matrix operations.
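Both idiosyncrasies are easy to reproduce by direct computation. The sketch below is an illustration only: the five 2 × 2 matrices are chosen for this sketch so that A, B, and C each have one row that is a multiple of the other, and the helper names are assumptions rather than anything defined in the text.

```python
def matmul(X, Y):
    """Multiply X (m x n) by Y (n x p), both given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def rows_proportional(M):
    """For a 2 x 2 matrix, True when one row is a multiple of the other."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0] == 0


# Illustrative matrices (assumed for this sketch).
A = [[2, 4], [1, 2]]
B = [[-2, 4], [1, -2]]
C = [[2, 3], [6, 9]]
D = [[1, 1], [1, 2]]
E = [[-2, 1], [3, 2]]

AB = matmul(A, B)   # a zero product from two nonzero factors
CD = matmul(C, D)
CE = matmul(C, E)   # equals CD although D differs from E

print(AB)
print(CD == CE, D == E)
print([rows_proportional(M) for M in (A, B, C)])
```

The last line confirms that the offending matrices all have proportional rows, which is the rough description of singularity given above.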





EXERCISE 4.5 


9 
, _f-1 8 7], /2|- fu]. 
Given A= | 0-2 2) = [6]: sex [2) 


1. Calculate: (a) AI (b) IA (c) Ix (d) x'I
Indicate the dimension of the identity matrix used in each case.

2. Calculate: (a) Ab (b) AIb (c) x'IA (d) x'A
Does the insertion of I in (b) affect the result in (a)? Does the deletion of I in (d) affect
the result in (c)?
3. What is the dimension of the null matrix resulting from each of the following?
(a) Premultiply A by a 5 × 2 null matrix.
(b) Postmultiply A by a 3 × 6 null matrix.
(c) Premultiply b by a 2 × 3 null matrix.
(d) Postmultiply x by a 1 × 5 null matrix.
4. Show that the diagonal matrix

\[ \begin{bmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{bmatrix} \]

can be idempotent only if each diagonal element is either 1 or 0. How many different
numerical idempotent diagonal matrices of dimension n × n can be constructed altogether
from such a matrix?


4.6 Transposes and Inverses 





When the rows and columns of a matrix A are interchanged, so that its first row becomes
the first column, and vice versa, we obtain the transpose of A, which is denoted by $A'$ or
$A^T$. The prime symbol is by no means new to us; it was used earlier to distinguish a row
vector from a column vector. In the newly introduced terminology, a row vector x' constitutes
the transpose of the column vector x. The superscript T in the alternative symbol is
obviously shorthand for the word transpose.

Example 1

Given $A = \begin{bmatrix} 3 & 8 & -9 \\ 1 & 0 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & 4 \\ 1 & 7 \end{bmatrix}$, we can interchange the rows and
columns and write

\[ \underset{(3 \times 2)}{A'} = \begin{bmatrix} 3 & 1 \\ 8 & 0 \\ -9 & 4 \end{bmatrix} \qquad \text{and} \qquad \underset{(2 \times 2)}{B'} = \begin{bmatrix} 3 & 1 \\ 4 & 7 \end{bmatrix} \]

By definition, if a matrix A is m × n, then its transpose A' must be n × m. An n × n square
matrix, however, possesses a transpose with the same dimension.

Example 2


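If $C = \begin{bmatrix} 9 & -1 \\ 2 & 0 \end{bmatrix}$ and $D = \begin{bmatrix} 1 & 0 & 4 \\ 0 & 3 & 7 \\ 4 & 7 & 2 \end{bmatrix}$, then

\[ C' = \begin{bmatrix} 9 & 2 \\ -1 & 0 \end{bmatrix} \qquad \text{and} \qquad D' = \begin{bmatrix} 1 & 0 & 4 \\ 0 & 3 & 7 \\ 4 & 7 & 2 \end{bmatrix} \]

Here, the dimension of each transpose is identical with that of the original matrix.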


In D', we also note the remarkable result that D' inherits not only the dimension of D
but also the original array of elements! The fact that D' = D is the result of the symmetry
of the elements with reference to the principal diagonal. Considering the principal diagonal
in D as a mirror, the elements located to its northeast are exact images of the elements
to its southwest; hence the first row reads identically with the first column, and so forth. The
matrix D exemplifies the special class of square matrices known as symmetric matrices.
Another example of such a matrix is the identity matrix I, which, as a symmetric matrix,
has the transpose I' = I.


Properties of Transposes
The following properties characterize transposes:

\[ (A')' = A \tag{4.9} \]
\[ (A + B)' = A' + B' \tag{4.10} \]
\[ (AB)' = B'A' \tag{4.11} \]

The first says that the transpose of the transpose is the original matrix, a rather
self-evident conclusion.

The second property may be verbally stated thus: The transpose of a sum is the sum of
the transposes.

Example 3

If $A = \begin{bmatrix} 4 & 1 \\ 9 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 0 \\ 7 & 1 \end{bmatrix}$, then

\[ (A + B)' = \begin{bmatrix} 6 & 1 \\ 16 & 1 \end{bmatrix}' = \begin{bmatrix} 6 & 16 \\ 1 & 1 \end{bmatrix} \]

\[ \text{and} \qquad A' + B' = \begin{bmatrix} 4 & 9 \\ 1 & 0 \end{bmatrix} + \begin{bmatrix} 2 & 7 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 6 & 16 \\ 1 & 1 \end{bmatrix} \]

The third property is that the transpose of a product is the product of the transposes in
reverse order. To appreciate the necessity for the reversed order, let us examine the dimension
conformability of the two products on the two sides of (4.11). If we let A be m × n and
B be n × p, then AB will be m × p, and (AB)' will be p × m. For equality to hold, it is
necessary that the right-hand expression B'A' be of the identical dimension. Since B' is
p × n and A' is n × m, the product B'A' is indeed p × m, as required. The dimension of
B'A' thus works out. Note that, on the other hand, the product A'B' is not even defined
unless m = p.

Example 4


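Given $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 0 & -1 \\ 6 & 7 \end{bmatrix}$, we have

\[ (AB)' = \begin{bmatrix} 12 & 13 \\ 24 & 25 \end{bmatrix}' = \begin{bmatrix} 12 & 24 \\ 13 & 25 \end{bmatrix} \]

\[ \text{and} \qquad B'A' = \begin{bmatrix} 0 & 6 \\ -1 & 7 \end{bmatrix} \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} = \begin{bmatrix} 12 & 24 \\ 13 & 25 \end{bmatrix} \]

This verifies the property.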
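The three transpose properties can also be checked mechanically. The following sketch uses plain Python lists; the helper names `transpose`, `matadd`, and `matmul` are assumptions introduced here, and the two matrices are illustrative 2 × 2 values.

```python
def transpose(X):
    """Interchange rows and columns of X."""
    return [list(col) for col in zip(*X)]


def matadd(X, Y):
    """Elementwise sum of two matrices of the same dimension."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def matmul(X, Y):
    """Multiply X (m x n) by Y (n x p)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


# Illustrative matrices (assumed for this sketch).
A = [[1, 2], [3, 4]]
B = [[0, -1], [6, 7]]

prop_4_9 = transpose(transpose(A)) == A
prop_4_10 = transpose(matadd(A, B)) == matadd(transpose(A), transpose(B))
prop_4_11 = transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))
wrong_order = transpose(matmul(A, B)) == matmul(transpose(A), transpose(B))

print(prop_4_9, prop_4_10, prop_4_11, wrong_order)
```

The last comparison shows why the reversed order in (4.11) matters: even when A'B' is defined, it need not equal (AB)'.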




Inverses and Their Properties 

For a given matrix A, the transpose A' is always derivable. On the other hand, its inverse
matrix (another type of “derived” matrix) may or may not exist. The inverse of matrix A,
denoted by $A^{-1}$, is defined only if A is a square matrix, in which case the inverse is the
matrix that satisfies the condition

\[ AA^{-1} = A^{-1}A = I \tag{4.12} \]

That is, whether A is pre- or postmultiplied by $A^{-1}$, the product will be the same identity
matrix. This is another exception to the rule that matrix multiplication is not commutative.
The following points are worth noting: 


1. Not every square matrix has an inverse: squareness is a necessary condition, but not a
sufficient condition, for the existence of an inverse. If a square matrix A has an inverse,
A is said to be nonsingular; if A possesses no inverse, it is called a singular matrix.

2. If $A^{-1}$ does exist, then the matrix A can be regarded as the inverse of $A^{-1}$, just as $A^{-1}$
is the inverse of A. In short, A and $A^{-1}$ are inverses of each other.

3. If A is n × n, then $A^{-1}$ must also be n × n; otherwise it cannot be conformable for both
pre- and postmultiplication. The identity matrix produced by the multiplication will also
be n × n.

4. If an inverse exists, then it is unique. To prove its uniqueness, let us suppose that B has
been found to be an inverse for A, so that

\[ AB = BA = I \]

Now assume that there is another matrix C such that AC = CA = I. By premultiplying
both sides of AB = I by C, we find that

\[ CAB = CI \ (= C) \qquad [\text{by (4.8)}] \]

Since CA = I by assumption, the preceding equation is reducible to

\[ IB = C \qquad \text{or} \qquad B = C \]

That is, B and C must be one and the same inverse matrix. For this reason, we can speak
of the (as against an) inverse of A.

5. The two parts of condition (4.12), namely, $AA^{-1} = I$ and $A^{-1}A = I$, actually imply
each other, so that satisfying either equation is sufficient to establish the inverse relationship
between A and $A^{-1}$. To prove this, we should show that if $AA^{-1} = I$, and
if there is a matrix B such that BA = I, then $B = A^{-1}$ (so that BA = I must in effect be
the equation $A^{-1}A = I$). Let us postmultiply both sides of the given equation BA = I
by $A^{-1}$; then

\[ (BA)A^{-1} = IA^{-1} \]
\[ B(AA^{-1}) = IA^{-1} \qquad [\text{associative law}] \]
\[ BI = IA^{-1} \qquad [AA^{-1} = I \text{ by assumption}] \]

Therefore, as required,

\[ B = A^{-1} \qquad [\text{by (4.8)}] \]

Analogously, it can be demonstrated that, if $A^{-1}A = I$, then the only matrix C which
yields $CA^{-1} = I$ is C = A.

Example 5


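Let $A = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix}$ and $B = \frac{1}{6} \begin{bmatrix} 2 & -1 \\ 0 & 3 \end{bmatrix}$; then, since the scalar multiplier ($\frac{1}{6}$) in B can be
moved to the rear (commutative law), we can write

\[ AB = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ 0 & 3 \end{bmatrix} \frac{1}{6} = \begin{bmatrix} 6 & 0 \\ 0 & 6 \end{bmatrix} \frac{1}{6} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

This establishes B as the inverse of A, and vice versa. The reverse multiplication, as expected,
also yields the same identity matrix:

\[ BA = \frac{1}{6} \begin{bmatrix} 2 & -1 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix} = \frac{1}{6} \begin{bmatrix} 6 & 0 \\ 0 & 6 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]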
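Condition (4.12) can be verified with exact rational arithmetic. The sketch below is an illustration, assuming the pair A = [3 1; 0 2] and B = (1/6)[2 −1; 0 3]; the helper `matmul` is a name introduced here, and Python's standard `fractions` module is used so that no rounding occurs.

```python
from fractions import Fraction


def matmul(X, Y):
    """Multiply X (m x n) by Y (n x p)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


# Assumed example pair: B is the candidate inverse of A.
A = [[3, 1],
     [0, 2]]
B = [[Fraction(2, 6), Fraction(-1, 6)],
     [Fraction(0, 6), Fraction(3, 6)]]

I2 = [[1, 0], [0, 1]]

AB = matmul(A, B)   # postmultiplying A by B
BA = matmul(B, A)   # premultiplying A by B

print(AB == I2, BA == I2)
```

Both products equal the same identity matrix, so each matrix is the inverse of the other.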


The following three properties of inverse matrices are of interest. If A and B are nonsingular
matrices with dimension n × n, then

\[ (A^{-1})^{-1} = A \tag{4.13} \]
\[ (AB)^{-1} = B^{-1}A^{-1} \tag{4.14} \]
\[ (A')^{-1} = (A^{-1})' \tag{4.15} \]

The first says that the inverse of an inverse is the original matrix. The second states that
the inverse of a product is the product of the inverses in reverse order. And the last one
means that the inverse of the transpose is the transpose of the inverse. Note that in these
statements the existence of the inverses and the satisfaction of the conformability condition
are presupposed.

The validity of (4.13) is fairly obvious, but let us prove (4.14) and (4.15). Given the
product AB, let us find its inverse and call it C. From (4.12) we know that CAB = I; thus,
postmultiplication of both sides by $B^{-1}A^{-1}$ will yield

\[ CABB^{-1}A^{-1} = IB^{-1}A^{-1} \ (= B^{-1}A^{-1}) \tag{4.16} \]

But the left side is reducible to

\[ CA(BB^{-1})A^{-1} = CAIA^{-1} \qquad [\text{by (4.12)}] \]
\[ = CAA^{-1} = CI = C \qquad [\text{by (4.12) and (4.8)}] \]

Substitution of this into (4.16) then tells us that $C = B^{-1}A^{-1}$ or, in other words, that the
inverse of AB is equal to $B^{-1}A^{-1}$, as alleged. In this proof, the equation $AA^{-1} =
A^{-1}A = I$ was utilized twice. Note that the application of this equation is permissible if
and only if a matrix and its inverse are strictly adjacent to each other in a product. We may
write $AA^{-1}B = IB = B$, but never $ABA^{-1} = B$.

The proof of (4.15) is as follows. Given A', let us find its inverse and call it D. By definition,
we then have DA' = I. But we know that

\[ (AA^{-1})' = I' = I \]

produces the same identity matrix. Thus we may write

\[ DA' = (AA^{-1})' = (A^{-1})'A' \qquad [\text{by (4.11)}] \]

Postmultiplying both sides by $(A')^{-1}$, we obtain

\[ DA'(A')^{-1} = (A^{-1})'A'(A')^{-1} \]
\[ \text{or} \qquad D = (A^{-1})' \qquad [\text{by (4.12)}] \]

Thus, the inverse of A' is equal to $(A^{-1})'$, as alleged.

In the proofs just presented, mathematical operations were performed on whole blocks
of numbers. If those blocks of numbers had not been treated as mathematical entities (matrices),
the same operations would have been much more lengthy and involved. The beauty
of matrix algebra lies precisely in its simplification of such operations.
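Properties (4.13) through (4.15) can be spot-checked numerically. The sketch below is a hedged illustration, not a proof: it relies on the standard closed-form inverse of a 2 × 2 matrix, which this section does not derive, and the function names and sample matrices are assumptions made for the example.

```python
from fractions import Fraction


def matmul(X, Y):
    """Multiply X (m x n) by Y (n x p)."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    """Interchange rows and columns of X."""
    return [list(col) for col in zip(*X)]


def inverse2(M):
    """Inverse of a nonsingular 2 x 2 matrix (closed form, assumed here)."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular; no inverse exists")
    return [[d / det, -b / det], [-c / det, a / det]]


# Illustrative nonsingular matrices (assumed for this sketch).
A = [[3, 1], [0, 2]]
B = [[1, 2], [3, 4]]

prop_4_13 = inverse2(inverse2(A)) == A
prop_4_14 = inverse2(matmul(A, B)) == matmul(inverse2(B), inverse2(A))
prop_4_15 = inverse2(transpose(A)) == transpose(inverse2(A))

print(prop_4_13, prop_4_14, prop_4_15)
```

Using `Fraction` keeps every entry exact, so the equality tests are not disturbed by floating-point rounding.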


Inverse Matrix and Solution of Linear-Equation System 

The application of the concept of inverse matrix to the solution of a simultaneous-equation
system is immediate and direct. Referring to the equation system in (4.3), we pointed out
earlier that it can be written in matrix notation as

\[ \underset{(3 \times 3)}{A}\ \underset{(3 \times 1)}{x} = \underset{(3 \times 1)}{d} \tag{4.17} \]

where A, x, and d are as defined in (4.4). Now if the inverse matrix $A^{-1}$ exists, the
premultiplication of both sides of the equation (4.17) by $A^{-1}$ will yield

\[ A^{-1}Ax = A^{-1}d \]

or

\[ \underset{(3 \times 1)}{x} = \underset{(3 \times 3)}{A^{-1}}\ \underset{(3 \times 1)}{d} \tag{4.18} \]

The left side of (4.18) is a column vector of variables, whereas the right-hand product is a
column vector of certain known numbers. Thus, by definition of the equality of matrices or
vectors, (4.18) shows the set of values of the variables that satisfy the equation system, i.e.,
the solution values. Furthermore, since $A^{-1}$ is unique if it exists, $A^{-1}d$ must be a unique
vector of solution values. We shall therefore write the x vector in (4.18) as x*, to indicate
its status as a (unique) solution.

Methods of testing the existence of the inverse and of its calculation will be discussed in
Chap. 5. It may be stated here, however, that the inverse of the matrix A in (4.4) is

\[ A^{-1} = \frac{1}{52} \begin{bmatrix} 18 & -16 & -10 \\ -13 & 26 & 13 \\ -17 & 18 & 21 \end{bmatrix} \]

Thus (4.18) will turn out to be

\[ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \frac{1}{52} \begin{bmatrix} 18 & -16 & -10 \\ -13 & 26 & 13 \\ -17 & 18 & 21 \end{bmatrix} \begin{bmatrix} 22 \\ 12 \\ 10 \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \\ 1 \end{bmatrix} \]

which gives the solution: $x_1^* = 2$, $x_2^* = 3$, and $x_3^* = 1$.


The upshot is that one way of finding the solution of a linear-equation system
Ax = d, where the coefficient matrix A is nonsingular, is to first find the inverse $A^{-1}$, and
then postmultiply $A^{-1}$ by the constant vector d. The product $A^{-1}d$ will then give the
solution values of the variables.
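The procedure x* = A⁻¹d can be sketched in a few lines. In the example below the inverse (1/52)[18 −16 −10; −13 26 13; −17 18 21] and the constants d = (22, 12, 10) follow the numerical illustration in this section, while the coefficient matrix `A` itself is an assumption: it is the matrix consistent with that inverse, since (4.4) is not reproduced here. The helper `matvec` is a name introduced for the sketch.

```python
from fractions import Fraction


def matvec(M, v):
    """Postmultiply the matrix M by the column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# Assumed coefficient matrix, chosen to be consistent with the quoted inverse.
A = [[6, 3, 1],
     [1, 4, -2],
     [4, -1, 5]]

# Inverse as quoted: (1/52) times an integer matrix.
A_inv = [[Fraction(v, 52) for v in row]
         for row in [[18, -16, -10],
                     [-13, 26, 13],
                     [-17, 18, 21]]]

d = [22, 12, 10]            # constant vector

x_star = matvec(A_inv, d)   # x* = A^{-1} d
print(x_star)

# Substituting x* back into Ax = d confirms that it is a solution.
print(matvec(A, x_star) == d)
```

In practice one rarely inverts a matrix just to solve one system, but the sketch mirrors the logic of (4.18) step by step.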


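Example 6

As shown in Example 11 of Sec. 4.2, the simple national-income model

\[ Y = C + I_0 + G_0 \]
\[ C = a + bY \]

can be written in matrix notation as Ax = d, where

\[ A = \begin{bmatrix} 1 & -1 \\ -b & 1 \end{bmatrix} \qquad x = \begin{bmatrix} Y \\ C \end{bmatrix} \qquad \text{and} \qquad d = \begin{bmatrix} I_0 + G_0 \\ a \end{bmatrix} \]

The inverse of matrix A is (see explanation in Sec. 5.6)

\[ A^{-1} = \frac{1}{1 - b} \begin{bmatrix} 1 & 1 \\ b & 1 \end{bmatrix} \]

Thus the solution of the model is $x^* = A^{-1}d$, or

\[ \begin{bmatrix} Y^* \\ C^* \end{bmatrix} = \frac{1}{1 - b} \begin{bmatrix} 1 & 1 \\ b & 1 \end{bmatrix} \begin{bmatrix} I_0 + G_0 \\ a \end{bmatrix} = \frac{1}{1 - b} \begin{bmatrix} I_0 + G_0 + a \\ b(I_0 + G_0) + a \end{bmatrix} \]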
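To see the national-income solution at work, the sketch below evaluates x* = A⁻¹d for one set of parameter values. The function name `solve_income` and the numbers a = 50, b = 3/4, I₀ = 100, G₀ = 50 are purely hypothetical choices for illustration; the model is taken to be Y = C + I₀ + G₀ and C = a + bY.

```python
from fractions import Fraction


def solve_income(a, b, I0, G0):
    """Return (Y*, C*) from x* = A^{-1} d for the simple national-income model."""
    if b == 1:
        raise ValueError("b = 1 makes the coefficient matrix singular")
    scale = 1 / (1 - b)                 # the scalar 1 / (1 - b)
    A_inv = [[scale, scale],
             [scale * b, scale]]        # (1 / (1 - b)) [1 1; b 1]
    d = [I0 + G0, a]                    # constant vector
    Y = A_inv[0][0] * d[0] + A_inv[0][1] * d[1]
    C = A_inv[1][0] * d[0] + A_inv[1][1] * d[1]
    return Y, C


# Hypothetical parameter values, chosen only for illustration.
a, b, I0, G0 = Fraction(50), Fraction(3, 4), Fraction(100), Fraction(50)
Y_star, C_star = solve_income(a, b, I0, G0)
print(Y_star, C_star)

# The solution should satisfy both original equations.
print(Y_star == C_star + I0 + G0, C_star == a + b * Y_star)
```

Checking the result against the two structural equations, rather than against the formula alone, guards against algebra slips in the inverse.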











EXERCISE 4.6 


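1. Given $A = \begin{bmatrix} 0 & 4 \\ -1 & 3 \end{bmatrix}$, $B = \begin{bmatrix} 3 & -8 \\ 0 & 1 \end{bmatrix}$, and $C = \begin{bmatrix} 1 & 0 & 9 \\ 6 & 1 & 1 \end{bmatrix}$, find A', B', and C'.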


2. Use the matrices given in Prob. 1 to verify that
(a) (A + B)' = A' + B' (b) (AC)' = C'A'

3. Generalize the result (4.11) to the case of a product of three matrices by proving that,
for any conformable matrices A, B, and C, the equation (ABC)' = C'B'A' holds.

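4. Given the following four matrices, test whether any one of them is the inverse of
another:

\[ D = \begin{bmatrix} 1 & 12 \\ 0 & 3 \end{bmatrix} \qquad E = \begin{bmatrix} 1 & 1 \\ 6 & 8 \end{bmatrix} \qquad F = \begin{bmatrix} 1 & -4 \\ 0 & \frac{1}{3} \end{bmatrix} \qquad G = \begin{bmatrix} 4 & -\frac{1}{2} \\ -3 & \frac{1}{2} \end{bmatrix} \]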


5. Generalize the result (4.14) by proving that, for any conformable nonsingular matrices
A, B, and C, the equation $(ABC)^{-1} = C^{-1}B^{-1}A^{-1}$ holds.
6. Let $A = I - X(X'X)^{-1}X'$.
(a) Must A be square? Must (X'X) be square? Must X be square?
(b) Show that matrix A is idempotent. [Note: If X' and X are not square, it is inappropriate
to apply (4.14).]


4.7. Finite Markov Chains 


A common application of matrix algebra is found in what is known as Markov processes or
Markov chains. Markov processes are used to measure or estimate movements over time.
This involves the use of a Markov transition matrix, where each value in the transition
matrix is a probability of moving from one state (location, job, etc.) to another state. There
is also a vector containing the initial distribution across the various states. By repeatedly
multiplying such a vector by the transition matrix, one can estimate changes across states
over time.

Consider the problem of internal employee movement within a company that has many
different branches, or outlets.* A simple illustration using two branches, such as Abbotsford
and Burnaby, will help to demonstrate the basics of a Markov process. To determine the
number of employees in Abbotsford tomorrow, we take the probability that the employees
will stay in the Abbotsford branch multiplied by the total number of employees currently in
Abbotsford, which gives the total number of current Abbotsford employees who will
remain tomorrow. Added to this number is the number of Burnaby employees transferring
to Abbotsford. This number is found by multiplying the total number of current Burnaby
employees by the probability of a Burnaby employee transferring to Abbotsford. The
process is the same for determining the number of employees in the Burnaby
region tomorrow, made up of those Burnaby employees who chose to remain and the
Abbotsford employees who transfer into the Burnaby region today. The process described
involves four probabilities. These four probabilities together can be arranged in a matrix.
This is known as a Markov transition matrix (or simply, a “Markov”).

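Let $A_t$ and $B_t$ denote the populations of Abbotsford and Burnaby, respectively, at some
time t. Further, define the transitional probabilities as follows:

$P_{AA}$ = probability that a current A remains an A
$P_{AB}$ = probability that a current A moves to B
$P_{BB}$ = probability that a current B remains a B
$P_{BA}$ = probability that a current B moves to A

If we denote the distribution of employees across locations at time t as a vector

\[ x_t' = \begin{bmatrix} A_t & B_t \end{bmatrix} \]

and the transitional probabilities in matrix form

\[ M = \begin{bmatrix} P_{AA} & P_{AB} \\ P_{BA} & P_{BB} \end{bmatrix} \]

then the distribution of employees across locations in the next period (t + 1) is

\[ \underset{(1 \times 2)}{x_t'}\ \underset{(2 \times 2)}{M} = \underset{(1 \times 2)}{x_{t+1}'} \]

\[ \begin{bmatrix} A_t & B_t \end{bmatrix} \begin{bmatrix} P_{AA} & P_{AB} \\ P_{BA} & P_{BB} \end{bmatrix} = \begin{bmatrix} (A_t P_{AA} + B_t P_{BA}) & (A_t P_{AB} + B_t P_{BB}) \end{bmatrix} = \begin{bmatrix} A_{t+1} & B_{t+1} \end{bmatrix} \]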

* We would like to thank Sarah Dunn for this example. This work comes from her final project while a 
student at the British Columbia Institute of Technology, Burnaby, BC, Canada (June 2003). 


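To find the distribution of employees after two periods:

\[ \begin{bmatrix} A_{t+1} & B_{t+1} \end{bmatrix} \begin{bmatrix} P_{AA} & P_{AB} \\ P_{BA} & P_{BB} \end{bmatrix} = \begin{bmatrix} A_{t+2} & B_{t+2} \end{bmatrix} \]

\[ \begin{bmatrix} A_t & B_t \end{bmatrix} \begin{bmatrix} P_{AA} & P_{AB} \\ P_{BA} & P_{BB} \end{bmatrix} \begin{bmatrix} P_{AA} & P_{AB} \\ P_{BA} & P_{BB} \end{bmatrix} = \begin{bmatrix} A_{t+2} & B_{t+2} \end{bmatrix} \]

\[ \begin{bmatrix} A_t & B_t \end{bmatrix} \begin{bmatrix} P_{AA} & P_{AB} \\ P_{BA} & P_{BB} \end{bmatrix}^2 = \begin{bmatrix} A_{t+2} & B_{t+2} \end{bmatrix} \]

In general, for n periods:

\[ \begin{bmatrix} A_t & B_t \end{bmatrix} \begin{bmatrix} P_{AA} & P_{AB} \\ P_{BA} & P_{BB} \end{bmatrix}^n = \begin{bmatrix} A_{t+n} & B_{t+n} \end{bmatrix} \]

The 2 × 2 probability matrix M is known as the Markov transition matrix. For the case
where n is exogenous, the process is known as a finite Markov chain.

Example 1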


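Suppose the initial distribution of employees across the two locations at time t = 0 is

\[ x_0' = \begin{bmatrix} A_0 & B_0 \end{bmatrix} = \begin{bmatrix} 100 & 100 \end{bmatrix} \]

In other words, there are initially equal numbers at each location. Further, let the transitional
probabilities in matrix form be as follows:

\[ M = \begin{bmatrix} P_{AA} & P_{AB} \\ P_{BA} & P_{BB} \end{bmatrix} = \begin{bmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{bmatrix} \]

Then the distribution of employees across locations in the next period (t = 1) is

\[ \begin{bmatrix} 100 & 100 \end{bmatrix} \begin{bmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{bmatrix} = \begin{bmatrix} 110 & 90 \end{bmatrix} = \begin{bmatrix} A_1 & B_1 \end{bmatrix} \]

The distribution after two periods is given by

\[ \begin{bmatrix} 100 & 100 \end{bmatrix} \begin{bmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{bmatrix}^2 = \begin{bmatrix} 100 & 100 \end{bmatrix} \begin{bmatrix} 0.61 & 0.39 \\ 0.52 & 0.48 \end{bmatrix} = \begin{bmatrix} 113 & 87 \end{bmatrix} = \begin{bmatrix} A_2 & B_2 \end{bmatrix} \]

The distribution after 10 periods (t = 10) is given by

\[ \begin{bmatrix} 100 & 100 \end{bmatrix} \begin{bmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{bmatrix}^{10} = \begin{bmatrix} 100 & 100 \end{bmatrix} \begin{bmatrix} 0.5714 & 0.4286 \\ 0.5714 & 0.4286 \end{bmatrix} = \begin{bmatrix} 114.3 & 85.7 \end{bmatrix} = \begin{bmatrix} A_{10} & B_{10} \end{bmatrix} \]

Notice what happens when the Markov transition matrix is raised to higher and higher powers.
The new transition matrix found by raising the original matrix to increasingly higher
powers converges to a matrix where the rows are identical. This is referred to as the steady
state. What would you expect the eleventh or higher periods of distribution to look like?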
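The repeated multiplication in this example is easy to automate. The following is a minimal sketch, assuming the transition matrix [0.7 0.3; 0.4 0.6] and the initial distribution [100 100]; the function names `step` and `distribution_after` are introduced here for illustration, and exact fractions are used to avoid rounding drift.

```python
from fractions import Fraction


def step(x, M):
    """One period: postmultiply the row vector x by the transition matrix M."""
    return [sum(x[i] * M[i][j] for i in range(len(x)))
            for j in range(len(M[0]))]


def distribution_after(x0, M, n):
    """Distribution after n periods, i.e. x0 times M to the nth power."""
    x = list(x0)
    for _ in range(n):
        x = step(x, M)
    return x


# Transition probabilities written as exact fractions (0.7, 0.3, 0.4, 0.6).
M = [[Fraction(7, 10), Fraction(3, 10)],
     [Fraction(4, 10), Fraction(6, 10)]]
x0 = [100, 100]

for n in (1, 2, 10, 11):
    x = distribution_after(x0, M, n)
    print(n, [round(float(v), 1) for v in x])
```

By period 10 the distribution has essentially stopped changing, which is what the convergence of the powers of M to a matrix with identical rows implies.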


Special Case: Absorbing Markov Chains
Now, let us extend the model by adding a third option: Employees can exit the company,
with

$P_{AE}$ = probability that a current A chooses to exit (E)

$P_{BE}$ = probability that a current B chooses to exit (E)

At this point, we will add the following assumptions:

\[ P_{EA} = 0 \qquad P_{EB} = 0 \qquad P_{EE} = 1 \]

where $P_{EA}$, $P_{EB}$, and $P_{EE}$ are the probabilities that an employee who is currently an E will
go to A, B, or E, respectively. In other words, nobody who leaves the company ever returns.
It is also implied by these restrictions that our company never replaces employees that leave
(there are no new hires).
Starting at time t = 0, our Markov chain now becomes

\[ \begin{bmatrix} A_0 & B_0 & E_0 \end{bmatrix} \begin{bmatrix} P_{AA} & P_{AB} & P_{AE} \\ P_{BA} & P_{BB} & P_{BE} \\ P_{EA} & P_{EB} & P_{EE} \end{bmatrix}^n = \begin{bmatrix} A_n & B_n & E_n \end{bmatrix} \]

or

\[ \begin{bmatrix} A_0 & B_0 & E_0 \end{bmatrix} \begin{bmatrix} P_{AA} & P_{AB} & P_{AE} \\ P_{BA} & P_{BB} & P_{BE} \\ 0 & 0 & 1 \end{bmatrix}^n = \begin{bmatrix} A_n & B_n & E_n \end{bmatrix} \]

(Assume $E_0 = 0$.)

This type of Markov process is referred to as an absorbing Markov chain. Because of the
values of the transition probabilities found in the third row, we see that once an employee
becomes an E in one state (time period) that employee will remain there for all future states
(time periods). As n goes to infinity, $A_n$ and $B_n$ will approach zero and $E_n$ will approach
the value of the total number of workers at time zero (i.e., $A_0 + B_0 + E_0$).
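The absorbing behavior can be observed by iterating a three-state chain. In the sketch below every number is hypothetical: the exit probability of 0.1 per period and the remaining entries of the transition matrix are assumptions chosen only so that each row sums to one and the third row is (0, 0, 1).

```python
from fractions import Fraction


def step(x, M):
    """One period: postmultiply the row vector x by the transition matrix M."""
    return [sum(x[i] * M[i][j] for i in range(len(x)))
            for j in range(len(M[0]))]


def distribution_after(x0, M, n):
    """Distribution after n periods."""
    x = list(x0)
    for _ in range(n):
        x = step(x, M)
    return x


F = Fraction
# Hypothetical probabilities; the third row makes E an absorbing state.
M = [[F(6, 10), F(3, 10), F(1, 10)],
     [F(3, 10), F(6, 10), F(1, 10)],
     [F(0),     F(0),     F(1)]]
x0 = [100, 100, 0]          # E_0 = 0, as assumed in the text

for n in (1, 10, 50, 200):
    x = distribution_after(x0, M, n)
    print(n, [round(float(v), 3) for v in x])
```

The head count is conserved in every period, while the mass in A and B drains steadily into E.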











EXERCISE 4.7 


1. Consider the situation of a mass layoff (i.e., a factory shuts down) where 1,200 people
become unemployed and now begin a job search. In this case there are two states:
employed (E) and unemployed (U) with an initial vector

\[ x_0' = \begin{bmatrix} E & U \end{bmatrix} = \begin{bmatrix} 0 & 1{,}200 \end{bmatrix} \]

Suppose that in any given period an unemployed person will find a job with probability
0.7 and will therefore remain unemployed with a probability of 0.3. Additionally,
persons who find themselves employed in any given period may lose their job with a
probability of 0.1 (and will have a 0.9 probability of remaining employed).

(a) Set up the Markov transition matrix for this problem.

(b) What will be the number of unemployed people after (i) 2 periods; (ii) 3 periods;
(iii) 5 periods; (iv) 10 periods?

(c) What is the steady-state level of unemployment?


Chapter 5

Linear Models and Matrix Algebra (Continued)


In Chap. 4, it was shown that a linear-equation system, however large, may be written in a
compact matrix notation. Furthermore, such an equation system can be solved by finding
the inverse of the coefficient matrix, provided the inverse exists. Now we must address
ourselves to the questions of how to test for the existence of the inverse and how to find that
inverse. Only after we have answered these questions will it be possible to apply matrix
algebra meaningfully to economic models.


5.1 Conditions for Nonsingularity of a Matrix


A given coefficient matrix A can have an inverse (i.e., can be “nonsingular”) only if it is
square. As was pointed out earlier, however, the squareness condition is necessary but not
sufficient for the existence of the inverse $A^{-1}$. A matrix can be square, but singular (without
an inverse) nonetheless.


Necessary versus Sufficient Conditions
The concepts of “necessary condition” and “sufficient condition” are used frequently in
economics. It is important that we understand their precise meanings before proceeding
further.

A necessary condition is in the nature of a prerequisite: Suppose that a statement p is
true only if another statement q is true; then q constitutes a necessary condition of p.
Symbolically, we express this as follows:

\[ p \Rightarrow q \tag{5.1} \]

which is read as “p only if q,” or alternatively, “if p, then q.” It is also logically correct to
interpret (5.1) to mean “p implies q.” It may happen, of course, that we also have $p \Rightarrow w$
at the same time. Then both q and w are necessary conditions for p.

Example 1


If we let p be the statement “a person is a father” and q be the statement “a person is male,”
then the logical statement $p \Rightarrow q$ applies. A person is a father only if he is male, and to be
male is a necessary condition for fatherhood. Note, however, that the converse is not true:
fatherhood is not a necessary condition for maleness.


A different type of situation is one in which a statement p is true if q is true, but p can
also be true when q is not true. In this case, q is said to be a sufficient condition for p. The
truth of q suffices to establish the truth of p, but it is not a necessary condition for p. This
case is expressed symbolically by

\[ p \Leftarrow q \tag{5.2} \]

which is read: “p if q” (without the word only), or alternatively, “if q, then p,” as if reading
(5.2) backward. It can also be interpreted to mean “q implies p.”

Example 2

If we let p be the statement “one can get to Europe” and q be the statement “one takes a
plane to Europe,” then $p \Leftarrow q$. Flying can serve to get one to Europe, but since ocean
transportation is also feasible, flying is not a prerequisite. We can write $p \Leftarrow q$, but not $p \Rightarrow q$.

In a third possible situation, q is both necessary and sufficient for p. In such an event,
we write

\[ p \Leftrightarrow q \tag{5.3} \]

which is read: “p if and only if q” (also written as “p iff q”). The double-headed arrow is
really a combination of the two types of arrow in (5.1) and (5.2), hence the joint use of the
two terms “if” and “only if.” Note that (5.3) states not only that p implies q but also that q
implies p.

Example 3


If we let p be the statement “there are less than 30 days in the month” and q be the statement
“it is the month of February,” then $p \Leftrightarrow q$. To have less than 30 days in the month, it
is necessary that it be February. Conversely, the specification of February is sufficient to
establish that there are less than 30 days in the month. Thus q is a necessary-and-sufficient
condition for p.

In order to prove $p \Rightarrow q$, it needs to be shown that q follows logically from p. Similarly,
to prove $p \Leftarrow q$ requires a demonstration that p follows logically from q. But to prove $p \Leftrightarrow q$
necessitates a demonstration that p and q follow from each other.
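The February illustration can be checked by brute force over the twelve months. The sketch below is only an illustration: the helper names `implies` and `check` are assumptions, the third statement r (“fewer than 31 days”) is added here to show a condition that is sufficient without being necessary, and the month lengths come from Python's standard `calendar` module.

```python
import calendar


def implies(p, q):
    """Truth value of 'p implies q', i.e. 'p only if q'."""
    return (not p) or q


def check(year=2023):
    """Test necessity and sufficiency of 'it is February' month by month."""
    months = range(1, 13)
    days = {m: calendar.monthrange(year, m)[1] for m in months}
    p = {m: days[m] < 30 for m in months}   # "fewer than 30 days"
    q = {m: m == 2 for m in months}         # "it is February"
    r = {m: days[m] < 31 for m in months}   # "fewer than 31 days" (added here)
    return {
        "q necessary for p": all(implies(p[m], q[m]) for m in months),
        "q sufficient for p": all(implies(q[m], p[m]) for m in months),
        "q necessary for r": all(implies(r[m], q[m]) for m in months),
        "q sufficient for r": all(implies(q[m], r[m]) for m in months),
    }


for claim, holds in check().items():
    print(claim, holds)
```

February is both necessary and sufficient for p, but only sufficient for r, because April, June, September, and November also have fewer than 31 days.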


Necessary conditions and sufficient conditions are important as screening devices.
Consider a pool of applicants being considered for scholarship awards, or for job positions.
Since necessary conditions are in the nature of prerequisites, they serve to separate the
candidates into two groups: Those who fail to meet the necessary conditions are automatically
disqualified; those who satisfy the necessary conditions remain as admissible candidates.
To remain as an admissible candidate, however, carries no guarantee that the candidate will
eventually be successful. Thus, necessary conditions are more conclusive in screening out
the unsuccessful candidates than in identifying the successful ones. In general, we should
bear in mind that necessary conditions are not in themselves sufficient.

In contrast to necessary conditions, sufficient conditions serve directly to identify
successful candidates. A candidate that satisfies a sufficient condition is automatically a
successful one. Just as necessary conditions are not in themselves sufficient, sufficient
conditions are not in themselves necessary. This is because, along with any given sufficient
condition, there may exist other, less stringent, sufficient conditions, and the candidate who
fails to satisfy the given sufficient condition may yet qualify under an easier sufficient
condition. For example, a grade of A is sufficient for passing a course, but it is not a necessary
condition since a grade of B is also sufficient.

The most effective screening device is found in the necessary-and-sufficient conditions.
Failure to satisfy such a condition means the candidate is definitely out, and satisfaction of
such a condition means the candidate is definitely in. We can find an immediate application
of this in our present discussion of nonsingularity of a matrix.


Conditions for Nonsingularity
After the squareness condition (a necessary condition) is already met, a sufficient condition
for the nonsingularity of a matrix is that its rows be linearly independent (or, what amounts
to the same thing, that its columns be linearly independent). When the dual conditions
of squareness and linear independence are taken together, they constitute the
necessary-and-sufficient condition for nonsingularity (nonsingularity ⇔ squareness and linear
independence).

An n × n coefficient matrix A can be considered as an ordered set of row vectors, i.e., as
a column vector whose elements are themselves row vectors:

\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} = \begin{bmatrix} v_1' \\ v_2' \\ \vdots \\ v_n' \end{bmatrix} \]

where $v_i' = [a_{i1} \ a_{i2} \ \cdots \ a_{in}]$, i = 1, 2, ..., n. For the rows (row vectors) to be linearly
independent, none must be a linear combination of the rest. More formally, as was
mentioned in Sec. 4.3, linear row independence requires that the only set of scalars $k_i$
which can satisfy the vector equation

\[ \sum_{i=1}^{n} k_i v_i' = \underset{(1 \times n)}{0} \tag{5.4} \]

be $k_i = 0$ for all i.

Example 4


If the coefficient matrix is

\[ A = \begin{bmatrix} 3 & 4 & 5 \\ 0 & 1 & 2 \\ 6 & 8 & 10 \end{bmatrix} = \begin{bmatrix} v_1' \\ v_2' \\ v_3' \end{bmatrix} \]

then, since [6 8 10] = 2[3 4 5], we have $v_3' = 2v_1' = 2v_1' + 0v_2'$. Thus the third row is
expressible as a linear combination of the first two, and the rows are not linearly independent.
Alternatively, we may write the previous equation as

\[ 2v_1' + 0v_2' - v_3' = [6 \ \ 8 \ \ 10] + [0 \ \ 0 \ \ 0] - [6 \ \ 8 \ \ 10] = [0 \ \ 0 \ \ 0] \]

Inasmuch as the set of scalars that led to the zero vector of (5.4) is not $k_i = 0$ for all i, it
follows that the rows are linearly dependent.
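The test in (5.4) amounts to exhibiting scalars, not all zero, whose weighted sum of the rows is the zero vector. A minimal sketch follows, assuming the rows [3 4 5], [0 1 2], [6 8 10] and the scalars (2, 0, −1); the function name `linear_combination` is introduced here for illustration.

```python
def linear_combination(scalars, vectors):
    """Return the weighted sum k_1 v_1 + ... + k_n v_n of equal-length vectors."""
    return [sum(k * v[j] for k, v in zip(scalars, vectors))
            for j in range(len(vectors[0]))]


# Rows of the coefficient matrix, assumed as in the example above.
rows = [[3, 4, 5],
        [0, 1, 2],
        [6, 8, 10]]

k = [2, 0, -1]                      # scalars that are not all zero
combo = linear_combination(k, rows)

print(combo)
print(any(k) and combo == [0, 0, 0])   # True signals linear dependence
```

Finding one such nonzero set of scalars is enough to establish dependence; establishing independence requires showing that no such set exists, which is why a systematic method is needed.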


Unlike the squareness condition, the linear-independence condition cannot normally be
ascertained at a glance. Thus a method of testing linear independence among rows (or
columns) needs to be developed. Before we concern ourselves with that task, however, it
would strengthen our motivation first to have an intuitive understanding of why the
linear-independence condition is heaped together with the squareness condition at all. From the
discussion of counting equations and unknowns in Sec. 3.4, we recall the general conclusion
that, for a system of equations to possess a unique solution, it is not sufficient to have
the same number of equations as unknowns. In addition, the equations must be consistent
with and functionally independent (meaning, in the present context of linear systems,
linearly independent) of one another. There is a fairly obvious tie-in between the “same
number of equations as unknowns” criterion and the squareness (same number of rows and
columns) of the coefficient matrix. What the “linear independence among the rows”
requirement does is to preclude the inconsistency and the linear dependence among the
equations as well. Taken together, therefore, the dual requirement of squareness and row
independence in the coefficient matrix is tantamount to the conditions for the existence of
a unique solution enunciated in Sec. 3.4.

Let us illustrate how the linear dependence among the rows of the coefficient matrix can
cause inconsistency or linear dependence among the equations themselves. Let the equation
system Ax = d take the form

\[ \begin{bmatrix} 10 & 4 \\ 5 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} d_1 \\ d_2 \end{bmatrix} \]

where the coefficient matrix A contains linearly dependent rows: $v_1' = 2v_2'$. (Note that
its columns are also dependent, the first being $\frac{5}{2}$ of the second.) We have not specified
the values of the constant terms $d_1$ and $d_2$, but there are only two distinct possibilities
regarding their relative values: (1) $d_1 = 2d_2$ and (2) $d_1 \neq 2d_2$. Under the first (with, say,
$d_1 = 12$ and $d_2 = 6$) the two equations are consistent but linearly dependent (just as the
two rows of matrix A are), for the first equation is merely the second equation times 2. One
equation is then redundant, and the system reduces in effect to a single equation,
$5x_1 + 2x_2 = 6$, with an infinite number of solutions. For the second possibility (with, say,
$d_1 = 12$ but $d_2 = 0$) the two equations are inconsistent, because if the first equation
($10x_1 + 4x_2 = 12$) is true, then, by halving each term, we can deduce that $5x_1 + 2x_2 = 6$;
consequently the second equation ($5x_1 + 2x_2 = 0$) cannot possibly be true also. Thus no
solution exists.

The upshot is that no unique solution will be available (under either possibility) so long
as the rows in the coefficient matrix A are linearly dependent. In fact, the only way to have
a unique solution is to have linearly independent rows (or columns) in the coefficient
matrix. In that case, matrix A will be nonsingular, which means that the inverse $A^{-1}$ does
exist, and that a unique solution $x^* = A^{-1}d$ can be found.


Rank of a Matrix 


Even though the concept of row independence has been discussed only with regard to square
matrices, it is equally applicable to any m × n rectangular matrix. If the maximum number
of linearly independent rows that can be found in such a matrix is r, the matrix is said to be
of rank r. (The rank also tells us the maximum number of linearly independent columns in
the said matrix.) The rank of an m × n matrix can be at most m or n, whichever is smaller.
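The rank can be computed mechanically. The sketch below is one possible implementation, not the book's procedure verbatim: it reduces the matrix using row interchanges, scaling of a row by a nonzero number, and adding a multiple of one row to another, none of which changes the rank, and then counts the pivot rows that remain. The function name `rank` and the sample matrices are assumptions made for illustration.

```python
from fractions import Fraction


def rank(matrix):
    """Number of linearly independent rows, found by row reduction."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0                                   # pivot rows found so far
    for c in range(n_cols):
        # Look for a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]       # interchange two rows
        lead = rows[r][c]
        rows[r] = [v / lead for v in rows[r]]             # scale pivot entry to 1
        for i in range(r + 1, n_rows):
            factor = rows[i][c]
            # Add (-factor) times the pivot row to row i.
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r


print(rank([[3, 4, 5], [0, 1, 2], [6, 8, 10]]))   # third row is twice the first
print(rank([[10, 4], [5, 2]]))                    # rows are proportional
print(rank([[1, 2, 3], [2, 0, 3]]))               # a 2 x 3 matrix
```

Exact fractions are used so that an entry that should vanish is exactly zero rather than a tiny floating-point residue.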








Given a matrix with only two rows (or two columns), row independence (or column
independence) is easily verified by visual inspection: one only has to check whether one
row (column) is the exact multiple of the other. But for a matrix of larger dimension, visual
inspection may not be feasible, and a more formal method is needed. One method for finding
the rank of a matrix A (not necessarily square), i.e., for determining the number of
independent rows in A, involves transforming A into a so-called echelon matrix by using
certain “elementary row operations.” A particular structural feature of the echelon matrix
will then tell us the rank of matrix A.

There are only three types of elementary row operations on a matrix:†

1. Interchange of any two rows in the matrix.

2. Multiplication (or division) of a row by any scalar k ≠ 0.

3. Addition of “k times any row” to another row.

While each of these operations converts a given matrix A into a different form, none of
them alters the rank. It is this characteristic of elementary row operations that enables us
to read the rank of A from its echelon matrix. The easiest way to explain the method of
echelon matrix is by a specific example.

Example 5


Find the rank of the matrix 


Oo -11 -4 
A=|2 6 2 
4 1 0 


from its echelan form. First, we check the first column of A for the presence of zero ele- 
ments. If there are zero elements in column 1, we move those zero elements to the bottom 
of the matrix. In the case of A, we want to move the 0 (first element of column 1) to the 
bottom of that column, which can be accomplished by interchanging row 1 and row 3 
(using the first elementary row operation). The result is 


\[
A_1 = \begin{bmatrix} 4 & 1 & 0 \\ 2 & 6 & 2 \\ 0 & -11 & -4 \end{bmatrix}
\]

Our next objective is to reshape the first column of $A_1$ into a unit vector $e_1$ as defined
in (4.7). To transform the element 4 into unity, we divide row 1 of $A_1$ by the scalar 4
(applying the second elementary row operation), which yields

\[
A_2 = \begin{bmatrix} 1 & \tfrac{1}{4} & 0 \\ 2 & 6 & 2 \\ 0 & -11 & -4 \end{bmatrix}
\]

Then, to transform the element 2 in column 1 of $A_2$ into 0, we multiply row 1 of $A_2$ by $-2$,
and then add the result to row 2 of $A_2$ (applying the third elementary row operation). The
resulting matrix,

\[
A_3 = \begin{bmatrix} 1 & \tfrac{1}{4} & 0 \\ 0 & 5\tfrac{1}{2} & 2 \\ 0 & -11 & -4 \end{bmatrix}
\]


† Elementary column operations can be defined similarly to elementary row operations. For our
purposes, row operations are sufficient.




now has the desired unit vector $e_1$ as its first column. Having achieved this, we now exclude
the first row of $A_3$ from further consideration, and continue to work only on the remaining
two rows, where we want to create a two-element unit vector in the second column by
transforming the element $5\tfrac{1}{2}$ into 1, and the element $-11$ into 0. To this end, we need to
divide row 2 of $A_3$ by $5\tfrac{1}{2}$, thereby changing the row into the vector $[0 \;\; 1 \;\; \tfrac{4}{11}]$, and then
add 11 times this vector to row 3 of $A_3$. The end result, in the form of

\[
A_4 = \begin{bmatrix} 1 & \tfrac{1}{4} & 0 \\ 0 & 1 & \tfrac{4}{11} \\ 0 & 0 & 0 \end{bmatrix}
\]

exemplifies the echelon matrix, which, by definition, possesses three structural features.
First, nonzero rows (rows with at least one nonzero element) appear above the zero rows
(rows that contain only 0s). Second, in every nonzero row, the first nonzero element is
unity. Third, the unit element (the first nonzero element) in any row must appear to the left
of the counterpart unit element of the immediately following row. It should be clear by now
that all the elementary row operations we have undertaken are designed to produce these
features in $A_4$.

Now, we can simply read the rank of A from the number of nonzero rows present in the
echelon matrix $A_4$. Since $A_4$ contains two nonzero rows, we can conclude that $r(A) = 2$.
This is, of course, also the rank of matrices $A_1$ through $A_4$, because elementary row
operations do not alter the rank of a matrix.
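The echelon procedure of Example 5 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `rank_by_echelon` is our own, exact fractions are used to avoid rounding, and the routine interchanges in the first row it finds with a nonzero leading element, which need not be the same interchange chosen in the text. The rank obtained is unaffected by that choice.

```python
from fractions import Fraction

def rank_by_echelon(matrix):
    """Return the rank of `matrix` (a list of rows) by reducing it to echelon form."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    pivot_row = 0
    for col in range(n_cols):
        # Operation 1: interchange rows so a nonzero element heads this column.
        candidates = [r for r in range(pivot_row, n_rows) if rows[r][col] != 0]
        if not candidates:
            continue
        src = candidates[0]
        rows[pivot_row], rows[src] = rows[src], rows[pivot_row]
        # Operation 2: divide the pivot row by its leading element to get unity.
        lead = rows[pivot_row][col]
        rows[pivot_row] = [x / lead for x in rows[pivot_row]]
        # Operation 3: add a multiple of the pivot row to each lower row.
        for r in range(pivot_row + 1, n_rows):
            factor = rows[r][col]
            rows[r] = [x - factor * p for x, p in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    # The rank is the number of nonzero rows in the echelon matrix.
    return sum(1 for row in rows if any(x != 0 for x in row))

A = [[0, -11, -4], [2, 6, 2], [4, 1, 0]]  # the matrix of Example 5
print(rank_by_echelon(A))  # prints 2
```

Because the third row is eliminated entirely, the routine reports two nonzero rows, in agreement with $r(A) = 2$.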


The method of echelon matrix transformation applies to nonsquare as well as square
matrices. We have chosen a square matrix for Example 5 because our immediate objective
is to address the question of nonsingularity, which pertains only to square matrices. By
definition, for an $n \times n$ matrix A to be nonsingular, it must have n linearly independent
rows (or columns); consequently, it must be of rank n, and its echelon matrix must contain
exactly n nonzero rows, with no zero rows at all. Conversely, an $n \times n$ matrix having rank n
must be nonsingular. Thus an $n \times n$ echelon matrix with no zero rows must be nonsingular,
as is the matrix from which the echelon matrix is derived via elementary row operations.
In Example 5, the matrix A is $3 \times 3$, but $r(A) = 2$; hence A is not nonsingular (that is, it is singular).





EXERCISE 5.1 
1. In the following paired statements, let p be the first statement and q the second.
Indicate for each case whether (5.1), (5.2), or (5.3) applies.
(a) It is a holiday; it is Thanksgiving Day.
(b) A geometric figure has four sides; it is a rectangle.
(c) Two ordered pairs (a, b) and (b, a) are equal; a is equal to b.
(d) A number is rational; it can be expressed as a ratio of two integers.
(e) A 4 x 4 matrix is nonsingular; the rank of the 4 x 4 matrix is 4.
(f) The gasoline tank in my car is empty; I cannot start my car.
(g) The letter is returned to the sender with the marking “addressee unknown”; the
sender wrote the wrong address on the envelope.




2. Let p be the statement “a geometric figure is a square,” and let q be as follows:
(a) It has four sides.
(b) It has four equal sides.
(c) It has four equal sides each perpendicular to the adjacent one.
Which is true for each case: $p \Rightarrow q$, $p \Leftarrow q$, or $p \Leftrightarrow q$?
3. Are the rows linearly independent in each of the following?


2448 290 04 -1 5 
| 9 5] Ol 2| ols >| | 2 13] 
4. Check whether the columns of each matrix in Prob. 3 are also linearly independent. Do
you get the same answer as for row independence?

5. Find the rank of each of the following matrices from its echelon matrix, and comment
on the question of nonsingularity.


tat 76 
@A=) 039 (QC=/0 1 
-100 80 


0-1 4 279 -1 
()B=|3 1 2 ()D=]4 1 1 
6 1 0 059 -3 


6. By definition of linear dependence among rows of a matrix, one or more rows can be
expressed as a linear combination of some other rows. In the echelon matrix, linear
dependence is signified by the presence of one or more zero rows. What provides the
link between the presence of a linear combination of rows in a given matrix and the
presence of zero rows in the echelon matrix?


5.2 Test of Nonsingularity by Use of Determinant





To ascertain whether a square matrix is nonsingular, we can also make use of the concept 
of determinant. 


Determinants and Nonsingularity 


The determinant of a square matrix A, denoted by $|A|$, is a uniquely defined scalar (number)
associated with that matrix. Determinants are defined only for square matrices. The
smallest possible matrix is, of course, the $1 \times 1$ matrix $A = [a_{11}]$. By definition, its
determinant is equal to the single element $a_{11}$ itself: $|A| = |a_{11}| = a_{11}$. The symbol $|a_{11}|$ here
must not be confused with the look-alike symbol for the absolute value of a number. In the
absolute-value context, we have, for instance, not only $|5| = 5$, but also $|-5| = 5$, because
the absolute value of a number is its numerical value without regard to the algebraic sign.
In contrast, the determinant symbol preserves the sign of the element, so while $|8| = 8$
(a positive number), we have $|-8| = -8$ (a negative number). This distinction proves to
be crucial in the later discussion when we apply determinantal tests whose results depend
critically on the signs of determinants of various dimensions, including $1 \times 1$ ones, such as
$|a_{11}| = a_{11}$.


For a $2 \times 2$ matrix $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$, its determinant is defined to be the sum of two
terms as follows:

\[
|A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{21}a_{12} \qquad [= \text{a scalar}] \tag{5.5}
\]

which is obtained by multiplying the two elements in the principal diagonal of A and then
subtracting the product of the two remaining elements. In view of the dimension of matrix
A, the determinant $|A|$ given in (5.5) is called a second-order determinant.

Example 1


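Given $A = \begin{bmatrix} 10 & 4 \\ 8 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & 5 \\ 0 & -1 \end{bmatrix}$, their determinants are

\[
|A| = \begin{vmatrix} 10 & 4 \\ 8 & 5 \end{vmatrix} = 10(5) - 8(4) = 18
\qquad \text{and} \qquad
|B| = \begin{vmatrix} 3 & 5 \\ 0 & -1 \end{vmatrix} = 3(-1) - 0(5) = -3
\]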


While a determinant (enclosed by two vertical bars rather than brackets) is by definition a
scalar, a matrix as such does not have a numerical value. In other words, a determinant is
reducible to a number, but a matrix is, in contrast, a whole block of numbers. It should also
be emphasized that a determinant is defined only for a square matrix, whereas a matrix as
such does not have to be square.


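Even at this early stage of discussion, it is possible to have an inkling of the relationship
between the linear dependence of the rows in a matrix A, on the one hand, and its
determinant $|A|$, on the other. The two matrices

\[
C = \begin{bmatrix} c_1' \\ c_2' \end{bmatrix} = \begin{bmatrix} 3 & 8 \\ 3 & 8 \end{bmatrix}
\qquad \text{and} \qquad
D = \begin{bmatrix} d_1' \\ d_2' \end{bmatrix} = \begin{bmatrix} 2 & 6 \\ 8 & 24 \end{bmatrix}
\]

both have linearly dependent rows, because $c_1 = c_2$ and $d_2 = 4d_1$. Both of their
determinants also turn out to be equal to zero:

\[
|C| = \begin{vmatrix} 3 & 8 \\ 3 & 8 \end{vmatrix} = 3(8) - 3(8) = 0
\qquad
|D| = \begin{vmatrix} 2 & 6 \\ 8 & 24 \end{vmatrix} = 2(24) - 8(6) = 0
\]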





This result strongly suggests that a “vanishing” determinant (a zero-value determinant)
may have something to do with linear dependence. We shall see that this is indeed the case.
Furthermore, the value of a determinant $|A|$ can serve not only as a criterion for testing the
linear independence of the rows (hence the nonsingularity) of matrix A, but also as an input
in the calculation of the inverse $A^{-1}$, if it exists.
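As a quick numerical check of definition (5.5), the following sketch evaluates four second-order determinants. The helper name `det2` is our own, and the entries of A, B, C, and D are assumed to be those of Example 1 and of the two row-dependent matrices discussed above.

```python
def det2(m):
    """Second-order determinant per (5.5): a11*a22 - a21*a12."""
    (a11, a12), (a21, a22) = m
    return a11 * a22 - a21 * a12

A = [[10, 4], [8, 5]]
B = [[3, 5], [0, -1]]
C = [[3, 8], [3, 8]]    # identical rows
D = [[2, 6], [8, 24]]   # second row is 4 times the first

for name, m in [("A", A), ("B", B), ("C", C), ("D", D)]:
    print(name, det2(m))
```

The two matrices with dependent rows give a zero value, while A and B do not.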

First, however, we must widen our vista by a discussion of higher-order determinants.


Evaluating a Third-Order Determinant 
A determinant of order 3 is associated with a $3 \times 3$ matrix. Given

\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
\]

its determinant has the value

\[
\begin{aligned}
|A| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
&= a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
 - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
 + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \\
&= a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} + a_{12}a_{23}a_{31} - a_{12}a_{21}a_{33} \\
&\quad + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} \qquad [= \text{a scalar}]
\end{aligned}
\tag{5.6}
\]

Looking first at the lower line of (5.6), we see the value of $|A|$ expressed as a sum of six
product terms, three of which are prefixed by minus signs and three by plus signs.
Complicated as this sum may appear, there is nonetheless a very easy way of “catching”
all these six terms from a given third-order determinant. This is best explained
diagrammatically (Fig. 5.1). In the determinant shown in Fig. 5.1, each element in the top row
has been linked with two other elements via two solid arrows as follows: $a_{11} \to a_{22} \to a_{33}$,
$a_{12} \to a_{23} \to a_{31}$, and $a_{13} \to a_{32} \to a_{21}$. Each triplet of elements so linked can be
multiplied out, and their product taken as one of the six product terms in (5.6). The solid-arrow
product terms are to be prefixed with plus signs.

On the other hand, each top-row element has also been connected with two other
elements via two broken arrows as follows: $a_{11} \to a_{32} \to a_{23}$, $a_{12} \to a_{21} \to a_{33}$, and
$a_{13} \to a_{22} \to a_{31}$. Each triplet of elements so connected can also be multiplied out, and
their product taken as one of the six terms in (5.6). Such products are prefixed by minus
signs. The sum of all the six products will then be the value of the determinant.

FIGURE 5.1 (third-order determinant with solid and broken arrows linking the elements)

Example 2

\[
\begin{vmatrix} 2 & 1 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{vmatrix}
= (2)(5)(9) + (1)(6)(7) + (3)(8)(4) - (2)(8)(6) - (1)(4)(9) - (3)(5)(7) = -9
\]

Example 3

\[
\begin{vmatrix} -7 & 0 & 3 \\ 9 & 1 & 4 \\ 0 & 6 & 5 \end{vmatrix}
= (-7)(1)(5) + (0)(4)(0) + (3)(6)(9) - (-7)(6)(4) - (0)(9)(5) - (3)(1)(0) = 295
\]
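The cross-diagonal rule can be written out directly as a small function. This is a sketch for third-order determinants only; the name `det3_cross_diagonal` is our own, and the two test matrices are those of Examples 2 and 3.

```python
def det3_cross_diagonal(m):
    """Third-order determinant by the six product terms of (5.6)."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = m
    # Solid-arrow triplets carry plus signs.
    plus = a11 * a22 * a33 + a12 * a23 * a31 + a13 * a32 * a21
    # Broken-arrow triplets carry minus signs.
    minus = a11 * a32 * a23 + a12 * a21 * a33 + a13 * a22 * a31
    return plus - minus

print(det3_cross_diagonal([[2, 1, 3], [4, 5, 6], [7, 8, 9]]))    # prints -9
print(det3_cross_diagonal([[-7, 0, 3], [9, 1, 4], [0, 6, 5]]))   # prints 295
```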


This method of cross-diagonal multiplication provides a handy way of evaluating a
third-order determinant, but unfortunately it is not applicable to determinants of orders higher
than 3. For the latter, we must resort to the so-called Laplace expansion of the determinant.


Evaluating an nth-Order Determinant by Laplace Expansion
Let us first explain the Laplace-expansion process for a third-order determinant. Returning
to the first line of (5.6), we see that the value of $|A|$ can also be regarded as a sum of three
terms, each of which is a product of a first-row element and a particular second-order
determinant. This latter process of evaluating $|A|$, by means of certain lower-order
determinants, illustrates the Laplace expansion of the determinant.

The three second-order determinants in (5.6) are not arbitrarily determined, but are
specified by means of a definite rule. The first one, $\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}$, is a subdeterminant of $|A|$
obtained by deleting the first row and first column of $|A|$. This is called the minor of the
element $a_{11}$ (the element at the intersection of the deleted row and column) and is denoted
by $|M_{11}|$. In general, the symbol $|M_{ij}|$ can be used to represent the minor obtained by
deleting the ith row and jth column of a given determinant. Since a minor is itself a determinant,
it has a value. As the reader can verify, the other two second-order determinants in (5.6) are,
respectively, the minors $|M_{12}|$ and $|M_{13}|$; that is,

\[
|M_{11}| \equiv \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
\qquad
|M_{12}| \equiv \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
\qquad
|M_{13}| \equiv \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}
\]

A concept closely related to the minor is that of the cofactor. A cofactor, denoted by
$|C_{ij}|$, is a minor with a prescribed algebraic sign attached to it.† The rule of sign is as
follows. If the sum of the two subscripts i and j in the minor $|M_{ij}|$ is even, then the cofactor
takes the same sign as the minor; that is, $|C_{ij}| \equiv |M_{ij}|$. If it is odd, then the cofactor takes
the opposite sign to the minor; that is, $|C_{ij}| \equiv -|M_{ij}|$. In short, we have

\[
|C_{ij}| \equiv (-1)^{i+j}\,|M_{ij}|
\]

where it is obvious that the expression $(-1)^{i+j}$ can be positive if and only if $(i + j)$ is even.
The fact that a cofactor has a specific sign is of extreme importance and should always be
borne in mind.

Example 4

In the determinant $\begin{vmatrix} 9 & 8 & 7 \\ 6 & 5 & 4 \\ 3 & 2 & 1 \end{vmatrix}$, the minor of the element 8 is

\[
|M_{12}| = \begin{vmatrix} 6 & 4 \\ 3 & 1 \end{vmatrix} = -6
\]

but the cofactor of the same element is

\[
|C_{12}| = -|M_{12}| = 6
\]

because $i + j = 1 + 2 = 3$ is odd. Similarly, the cofactor of the element 4 is

\[
|C_{23}| = -|M_{23}| = -\begin{vmatrix} 9 & 8 \\ 3 & 2 \end{vmatrix} = 6
\]

† Many writers use the symbols $M_{ij}$ and $C_{ij}$ (without the vertical bars) for minors and cofactors. We
add the vertical bars to give visual emphasis to the fact that minors and cofactors are in the nature of
determinants and, as such, have scalar values.


Using these new concepts, we can express a third-order determinant as

\[
\begin{aligned}
|A| &= a_{11}|M_{11}| - a_{12}|M_{12}| + a_{13}|M_{13}| \\
    &= a_{11}|C_{11}| + a_{12}|C_{12}| + a_{13}|C_{13}| = \sum_{j=1}^{3} a_{1j}|C_{1j}|
\end{aligned}
\tag{5.7}
\]

i.e., as a sum of three terms, each of which is the product of a first-row element and its
corresponding cofactor. Note the difference in the signs of the $a_{12}|M_{12}|$ and $a_{12}|C_{12}|$ terms in
(5.7). This is because 1 + 2 gives an odd number.

The Laplace expansion of a third-order determinant serves to reduce the evaluation
problem to one of evaluating only certain second-order determinants. A similar reduction
is achieved in the Laplace expansion of higher-order determinants. In a fourth-order
determinant $|B|$, for instance, the top row will contain four elements, $b_{11}, \ldots, b_{14}$; thus, in the
spirit of (5.7), we may write

\[
|B| = \sum_{j=1}^{4} b_{1j}|C_{1j}|
\]

where the cofactors $|C_{1j}|$ are of order 3. Each third-order cofactor can then be evaluated as
in (5.6). In general, the Laplace expansion of an nth-order determinant will reduce the
problem to one of evaluating n cofactors, each of which is of the (n - 1)st order, and the
repeated application of the process will methodically lead to lower and lower orders of
determinants, eventually culminating in the basic second-order determinants as defined
in (5.5). Then the value of the original determinant can be easily calculated.

Although the process of Laplace expansion has been couched in terms of the cofactors
of the first-row elements, it is also feasible to expand a determinant by the cofactors of any
row or, for that matter, of any column. For instance, if the first column of a third-order
determinant $|A|$ consists of the elements $a_{11}$, $a_{21}$, and $a_{31}$, expansion by the cofactors of
these elements will also yield the value of $|A|$:

\[
|A| = a_{11}|C_{11}| + a_{21}|C_{21}| + a_{31}|C_{31}| = \sum_{i=1}^{3} a_{i1}|C_{i1}|
\]

Example 5

Given $|A| = \begin{vmatrix} 5 & 6 & 1 \\ 2 & 3 & 0 \\ 7 & -3 & 0 \end{vmatrix}$, expansion by the first row produces the result

\[
|A| = 5\begin{vmatrix} 3 & 0 \\ -3 & 0 \end{vmatrix} - 6\begin{vmatrix} 2 & 0 \\ 7 & 0 \end{vmatrix} + \begin{vmatrix} 2 & 3 \\ 7 & -3 \end{vmatrix} = 0 - 0 - 27 = -27
\]

But expansion by the first column yields the identical answer:

\[
|A| = 5\begin{vmatrix} 3 & 0 \\ -3 & 0 \end{vmatrix} - 2\begin{vmatrix} 6 & 1 \\ -3 & 0 \end{vmatrix} + 7\begin{vmatrix} 6 & 1 \\ 3 & 0 \end{vmatrix} = 0 - 6 - 21 = -27
\]





Insofar as numerical calculation is concerned, this fact affords us an opportunity to
choose some “easy” row or column for expansion. A row or column with the largest number
of 0s or 1s is always preferable for this purpose, because a 0 times its cofactor is simply
0, so that the term will drop out, and a 1 times its cofactor is simply the cofactor itself, so
that at least one multiplication step can be saved. In Example 5, the easiest way to expand
the determinant is by the third column, which consists of the elements 1, 0, and 0. We
could therefore have evaluated it thus:

\[
|A| = 1\begin{vmatrix} 2 & 3 \\ 7 & -3 \end{vmatrix} = -6 - 21 = -27
\]


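To sum up, the value of a determinant $|A|$ of order n can be found by the Laplace
expansion of any row or any column as follows:

\[
\begin{aligned}
|A| &= \sum_{j=1}^{n} a_{ij}|C_{ij}| \qquad [\text{expansion by the } i\text{th row}] \\
    &= \sum_{i=1}^{n} a_{ij}|C_{ij}| \qquad [\text{expansion by the } j\text{th column}]
\end{aligned}
\tag{5.8}
\]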
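Formula (5.8) translates naturally into a recursive routine. The following is a minimal sketch, not an efficient algorithm (the number of operations grows very quickly with n); the function names are our own, and indices are counted from 0, which leaves the parity of i + j, and hence every sign, unchanged.

```python
def submatrix(m, i, j):
    """Delete row i and column j (both counted from 0)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]

def det(m):
    """Determinant of a square matrix by repeated Laplace expansion."""
    if len(m) == 1:
        return m[0][0]
    return expand_by_row(m, 0)

def expand_by_row(m, i):
    """Sum over j of a_ij times its cofactor, as in the first line of (5.8)."""
    return sum((-1) ** (i + j) * m[i][j] * det(submatrix(m, i, j))
               for j in range(len(m)))

def expand_by_column(m, j):
    """Sum over i of a_ij times its cofactor, as in the second line of (5.8)."""
    return sum((-1) ** (i + j) * m[i][j] * det(submatrix(m, i, j))
               for i in range(len(m)))

A = [[5, 6, 1], [2, 3, 0], [7, -3, 0]]  # the determinant of Example 5
print([expand_by_row(A, i) for i in range(3)])     # prints [-27, -27, -27]
print([expand_by_column(A, j) for j in range(3)])  # prints [-27, -27, -27]
```

Every row and every column gives the same value, which is why one is free to pick the row or column with the most 0s and 1s.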























EXERCISE 5.2 
1. Evaluate the following determinants:
813 4.0 2 abe 
(a)|4 0 1 {o {6 0°73 (|b ca 
603 8 2 3 cab 
12 3 1 1 4 x 35 9 
()|4.7 5 (@) |8 11 -2 “is oy 2 
369 o 4 7 a -1 8 




















2. Determine the signs to be attached to the relevant minors in order to get the following
cofactors of a determinant: $|C_{13}|$, $|C_{23}|$, $|C_{33}|$, $|C_{41}|$, and $|C_{34}|$.

3. Given $\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix}$, find the minors and cofactors of the elements a, b, and f.

4. Evaluate the following determinants: 

1 20 ¢ 27901 

2°34 6 5 648 

1 60 +1 ) o 090 
1@ -5 0 8 1-3 14 

5. In the first determinant of Prob. 4, find the value of the cofactor of the element 9.

6. Find the minors and cofactors of the third row, given


9 11 4 
A=|3 27 
6 10 4 


7. Use Laplace expansion to find the determinant of 


537 9 
A=| 25 6 
90 12 










5.3 Basic Properties of Determinants

We can now discuss some properties of determinants which will enable us to “discover” the
connection between linear dependence among the rows of a square matrix and the
vanishing of the determinant of that matrix.

Five basic properties will be discussed here. These are properties common to
determinants of all orders, although we shall illustrate mostly with second-order determinants:

Property I The interchange of rows and columns does not affect the value of a
determinant. In other words, the determinant of a matrix A has the same value as that of its
transpose $A'$, that is, $|A| = |A'|$.

Example 1

\[
\begin{vmatrix} 4 & 3 \\ 5 & 6 \end{vmatrix} = \begin{vmatrix} 4 & 5 \\ 3 & 6 \end{vmatrix} = 9
\]

Example 2

\[
\begin{vmatrix} a & b \\ c & d \end{vmatrix} = \begin{vmatrix} a & c \\ b & d \end{vmatrix} = ad - bc
\]

Property II The interchange of any two rows (or any two columns) will alter the sign, but
not the numerical value, of the determinant. (This property is obviously related to the first
elementary row operation on a matrix.)

Example 3

$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$, but the interchange of the two rows yields

\[
\begin{vmatrix} c & d \\ a & b \end{vmatrix} = cb - ad = -(ad - bc)
\]

Example 4

$\begin{vmatrix} 0 & 1 & 3 \\ 2 & 5 & 7 \\ 3 & 0 & 1 \end{vmatrix} = -26$, but the interchange of the first and third columns yields
$\begin{vmatrix} 3 & 1 & 0 \\ 7 & 5 & 2 \\ 1 & 0 & 3 \end{vmatrix} = 26$.








Property III The multiplication of any one row (or one column) by a scalar k will change
the value of the determinant k-fold. (This property is related to the second elementary row
operation on a matrix.)

Example 5

By multiplying the top row of the determinant in Example 3 by k, we get

\[
\begin{vmatrix} ka & kb \\ c & d \end{vmatrix} = kad - kbc = k(ad - bc) = k\begin{vmatrix} a & b \\ c & d \end{vmatrix}
\]

It is important to distinguish between the two expressions $kA$ and $k|A|$. In multiplying a
matrix A by a scalar k, all the elements in A are to be multiplied by k. But, if we read the
equation in the present example from right to left, it should be clear that, in multiplying
a determinant $|A|$ by k, only a single row (or column) should be multiplied by k. This
equation, therefore, in effect gives us a rule for factoring a determinant: whenever any single
row or column contains a common divisor, it may be factored out of the determinant.

Example 6

Factoring the first column and the second row in turn, we have

\[
\begin{vmatrix} 15a & 7b \\ 12c & 2d \end{vmatrix}
= 3\begin{vmatrix} 5a & 7b \\ 4c & 2d \end{vmatrix}
= 3(2)\begin{vmatrix} 5a & 7b \\ 2c & d \end{vmatrix}
= 6(5ad - 14bc)
\]

The direct evaluation of the original determinant will, of course, produce the same answer.

In contrast, the factoring of a matrix requires the presence of a common divisor for all
its elements, as in

\[
\begin{bmatrix} ka & kb \\ kc & kd \end{bmatrix} = k\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\]

Property IV The addition (subtraction) of a multiple of any row to (from) another row will
leave the value of the determinant unaltered. The same holds true if we replace the word
row by column in the previous statement. (This property is related to the third elementary
row operation on a matrix.)

Example 7

Adding k times the top row of the determinant in Example 3 to its second row, we end up
with the original determinant:

\[
\begin{vmatrix} a & b \\ c + ka & d + kb \end{vmatrix} = a(d + kb) - b(c + ka) = ad - bc = \begin{vmatrix} a & b \\ c & d \end{vmatrix}
\]

Property V If one row (or column) is a multiple of another row (or column), the value of
the determinant will be zero. As a special case of this, when two rows (or two columns) are
identical, the determinant will vanish.

Example 8

\[
\begin{vmatrix} 2a & 2b \\ a & b \end{vmatrix} = 2ab - 2ab = 0
\qquad
\begin{vmatrix} c & c \\ d & d \end{vmatrix} = cd - cd = 0
\]


Additional examples of this type of “vanishing” determinant can be found in Exercise 5.2-1. 


This important property is, in fact, a logical consequence of Property IV. To understand
this, let us apply Property IV to the two determinants in Example 8 and watch the outcome.
For the first one, try to subtract twice the second row from the top row; for the second
determinant, subtract the second column from the first column. Since these operations do
not alter the values of the determinants, we can write

\[
\begin{vmatrix} 2a & 2b \\ a & b \end{vmatrix} = \begin{vmatrix} 0 & 0 \\ a & b \end{vmatrix}
\qquad
\begin{vmatrix} c & c \\ d & d \end{vmatrix} = \begin{vmatrix} 0 & c \\ 0 & d \end{vmatrix}
\]

The new (reduced) determinants now contain, respectively, a row and a column of zeros;
thus their Laplace expansion must yield a value of zero in both cases. In general, when one
row (column) is a multiple of another row (column), the application of Property IV can
always reduce all elements of that row (column) to zero, and Property V therefore follows.

The basic properties just discussed are useful in several ways. For one thing, they can be
of great help in simplifying the task of evaluating determinants. By subtracting multiples
of one row (or column) from another, for instance, the elements of the determinant may be
reduced to much smaller and simpler numbers. Factoring, if feasible, can also accomplish
the same. If we can indeed apply these properties to transform some row or column into a
form containing mostly 0s or 1s, Laplace expansion of the determinant will become a much
more manageable task.
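The five properties can be spot-checked numerically on a second-order determinant. The sketch below uses a matrix and a scalar k of our own choosing, purely for illustration; a numerical check of this kind is, of course, not a proof.

```python
def det2(m):
    """Second-order determinant: a11*a22 - a21*a12."""
    (a11, a12), (a21, a22) = m
    return a11 * a22 - a21 * a12

def transpose(m):
    return [list(col) for col in zip(*m)]

A = [[4, 3], [5, 6]]  # illustrative matrix; |A| = 9
k = 3                 # illustrative scalar
base = det2(A)

# Property I: transposition leaves the value unchanged.
prop1 = det2(transpose(A)) == base
# Property II: interchanging the two rows reverses the sign.
prop2 = det2([A[1], A[0]]) == -base
# Property III: multiplying one row by k multiplies the value by k.
prop3 = det2([[k * x for x in A[0]], A[1]]) == k * base
# Property IV: adding k times row 1 to row 2 leaves the value unchanged.
prop4 = det2([A[0], [x + k * y for x, y in zip(A[1], A[0])]]) == base
# Property V: if one row is a multiple of another, the determinant vanishes.
prop5 = det2([A[0], [k * x for x in A[0]]]) == 0

print(prop1, prop2, prop3, prop4, prop5)  # prints True True True True True
```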


Determinantal Criterion for Nonsingularity 

Our present concern, however, is primarily to link the linear dependence of rows with the
vanishing of a determinant. For this purpose, Property V can be invoked. Consider an
equation system $Ax = d$:

\[
\begin{bmatrix} 3 & 4 & 2 \\ 15 & 20 & 10 \\ 4 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} d_1 \\ d_2 \\ d_3 \end{bmatrix}
\]

This system can have a unique solution if and only if the rows in the coefficient matrix A
are linearly independent, so that A is nonsingular. But the second row is five times the first;
the rows are indeed dependent, and hence no unique solution exists. The detection of this
row dependence was by visual inspection, but by virtue of Property V we could also have
discovered it through the fact that $|A| = 0$.

The row dependence in a matrix may, of course, assume a more intricate and secretive
pattern. For instance, in the matrix

\[
B = \begin{bmatrix} 4 & 1 & 2 \\ 5 & 2 & 1 \\ 1 & 0 & 1 \end{bmatrix} = \begin{bmatrix} v_1' \\ v_2' \\ v_3' \end{bmatrix}
\]

there exists row dependence because $2v_1' - v_2' - 3v_3' = 0$; yet this fact defies visual
detection. Even in this case, however, Property V will give us a vanishing determinant, $|B| = 0$,
since by adding three times $v_3'$ to $v_2'$ and subtracting twice $v_1'$ from it, the second row can be
reduced to a zero vector. In general, any pattern of linear dependence among rows will be
reflected in a vanishing determinant, and herein lies the beauty of Property V! Conversely,
if the rows are linearly independent, the determinant must have a nonzero value.
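The hidden dependence in B can be confirmed numerically. The sketch below assumes the entries of B are rows (4, 1, 2), (5, 2, 1), and (1, 0, 1); it checks the stated linear combination, carries out the row reduction described above, and evaluates the determinant by first-row expansion. The helper name `det3` is our own.

```python
def det3(m):
    """Third-order determinant by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

B = [[4, 1, 2], [5, 2, 1], [1, 0, 1]]
v1, v2, v3 = B

# The dependence 2*v1 - v2 - 3*v3 = 0, checked element by element.
combination = [2 * x - y - 3 * z for x, y, z in zip(v1, v2, v3)]

# Property IV: add 3*v3 to v2 and subtract 2*v1 from it; |B| is unchanged.
reduced_row2 = [y + 3 * z - 2 * x for x, y, z in zip(v1, v2, v3)]
B_reduced = [v1, reduced_row2, v3]

print(combination)                 # prints [0, 0, 0]
print(reduced_row2)                # prints [0, 0, 0]
print(det3(B), det3(B_reduced))    # prints 0 0
```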

We have, in the previous two paragraphs, tied the nonsingularity of a matrix principally
to the linear independence among rows. But, on occasion, we have made the claim that, for
a square matrix A, row independence $\Leftrightarrow$ column independence. We are now equipped to
prove that claim:

According to Property I, we know that $|A| = |A'|$. Since row independence in A $\Leftrightarrow$ $|A| \neq 0$,
we may also state that row independence in A $\Leftrightarrow$ $|A'| \neq 0$. But $|A'| \neq 0$ $\Leftrightarrow$ row
independence in the transpose $A'$ $\Leftrightarrow$ column independence in A (rows of $A'$ are by definition the
columns of A). Therefore, row independence in A $\Leftrightarrow$ column independence in A.


Our discussion of the test of nonsingularity can now be summarized. Given a
linear-equation system $Ax = d$, where A is an $n \times n$ coefficient matrix,

\[
\begin{aligned}
|A| \neq 0 &\Leftrightarrow \text{there is row (column) independence in matrix } A \\
&\Leftrightarrow A \text{ is nonsingular} \\
&\Leftrightarrow A^{-1} \text{ exists} \\
&\Leftrightarrow \text{a unique solution } x^* = A^{-1}d \text{ exists}
\end{aligned}
\]

Thus the value of the determinant of the coefficient matrix, $|A|$, provides a convenient
criterion for testing the nonsingularity of matrix A and the existence of a unique solution to
the equation system $Ax = d$. Note, however, that the determinantal criterion says nothing
about the algebraic signs of the solution values; i.e., even though we are assured of a unique
solution when $|A| \neq 0$, we may sometimes get negative solution values that are
economically inadmissible.

Example 9

Does the equation system

\[
\begin{aligned}
7x_1 - 3x_2 - 3x_3 &= 7 \\
2x_1 + 4x_2 + x_3 &= 0 \\
-2x_2 - x_3 &= 2
\end{aligned}
\]

possess a unique solution? The determinant $|A|$ is

\[
\begin{vmatrix} 7 & -3 & -3 \\ 2 & 4 & 1 \\ 0 & -2 & -1 \end{vmatrix} = -8 \neq 0
\]

Therefore a unique solution does exist.


Rank of a Matrix Redefined 

The rank of a matrix A was earlier defined to be the maximum number of linearly
independent rows in A. In view of the link between row independence and the nonvanishing of the
determinant, we can redefine the rank of an $m \times n$ matrix as the maximum order of a
nonvanishing determinant that can be constructed from the rows and columns of that matrix.
The rank of any matrix is a unique number.

Obviously, the rank can at most be m or n, whichever is smaller, because a determinant
is defined only for a square matrix, and from a matrix of dimension, say, $3 \times 5$, the largest
possible determinants (vanishing or not) will be of order 3. Symbolically, this fact may be
expressed as follows:

\[
r(A) \leq \min\{m, n\}
\]

which is read: “The rank of A is less than or equal to the minimum of the set of two
numbers m and n.” The rank of an $n \times n$ nonsingular matrix A must be n; in that case, we may
write $r(A) = n$.

Sometimes, one may be interested in the rank of the product of two matrices. In that
case, the following rule can be of use:

\[
r(AB) \leq \min\{r(A), r(B)\} \tag{5.9}
\]

While this rule does not yield a unique value of $r(AB)$, the application of the rule can
nevertheless lead to unique results. In particular, we can use (5.9) to show that if a matrix A,
with $r(A) = j$, is multiplied by any (conformable) nonsingular matrix B, the rank of the
product matrix AB (or BA, as the case may be), must be j. We shall prove this for the
product AB (the case of BA is analogous). First, looking at the right-hand side of (5.9), we see
only three possible cases: (i) $r(A) < r(B)$, (ii) $r(A) = r(B)$, and (iii) $r(A) > r(B)$.
For cases (i) and (ii), (5.9) reduces directly to $r(AB) \leq r(A) = j$. For case (iii), we find
that $r(AB) \leq r(B) < r(A) = j$. Thus, either way, we get

\[
r(AB) \leq r(A) = j \tag{5.10}
\]

Now consider the identity $(AB)B^{-1} = A$. By (5.9), we can write

\[
r[(AB)B^{-1}] \leq \min\{r(AB), r(B^{-1})\}
\]

Applying the same reasoning that led us to (5.10), we can conclude from this that

\[
r[(AB)B^{-1}] \leq r(AB)
\]

Since the left-side expression of this inequality is equal to $r(A) = j$, we may write

\[
j \leq r(AB) \tag{5.11}
\]

But (5.10) and (5.11) cannot be satisfied simultaneously unless $r(AB) = j$. Thus the rank
of the product matrix AB must be j, as asserted.
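The result just proved can be illustrated numerically. In the sketch below, A is taken to be the rank-2 matrix of Example 5 in Section 5.1, while B is a nonsingular matrix of our own choosing (its determinant is 1); the helper names `rank` and `matmul` are likewise our own. Exact fractions are used so that no rounding error can disguise a zero row.

```python
from fractions import Fraction

def rank(matrix):
    """Rank by row reduction: count the pivots found column by column."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][col] / rows[r][col]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def matmul(a, b):
    """Product of two conformable matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

A = [[0, -11, -4], [2, 6, 2], [4, 1, 0]]   # r(A) = 2
B = [[1, 2, 0], [0, 1, 0], [3, 0, 1]]      # nonsingular, so r(B) = 3

print(rank(A), rank(B))                        # prints 2 3
print(rank(matmul(A, B)), rank(matmul(B, A)))  # prints 2 2
```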





EXERCISE 5.3 

40-1 
21-7 
33 9 


2. Show that, when all the elements of an nth-order determinant $|A|$ are multiplied by a
number k, the result will be $k^n|A|$.


3. Which properties of determinants enable us to write the following? 


9 18} _|9 18 9 27 13 
27 sal-lo 2 4 |= 18|3 | 


4. Test whether the following matrices are nonsingular: 


f40 1 7-1 0 
(3) 19 1 -3 @/1 1 4 
13-3 -4 


1. Use the determinant to verify the first four properties of determinants. 








{g 














L711 0 
f 4-21 495 
)|-5 6 0 {| 301 
7 03 10 8 6 


5. What can you conclude about the rank of each matrix in Prob. 4?

6. Can any of the given sets of 3-vectors below span the 3-space? Why or why not? 
@u2i 231 4 2 
()(8 13) [1 2 8] [-7 1 5] 

7. Rewrite the simple national-income model (3.23) in the Ax = d format (with Y as
the first variable in the vector x), and then test whether the coefficient matrix A is
nonsingular.

8. Comment on the validity of the following statements:

(a) “Given any matrix A, we can always derive from it a transpose, and a determinant.”
(b) “Multiplying each element of an n x n determinant by 2 will double the value of
that determinant.”
(c) “If a square matrix A vanishes, then we can be sure that the equation system
Ax = d is nonsingular.”





5.4 Finding the Inverse Matrix

If the matrix A in the linear-equation system $Ax = d$ is nonsingular, then $A^{-1}$ exists, and
the solution of the system will be $x^* = A^{-1}d$. We have learned to test the nonsingularity of
A by the criterion $|A| \neq 0$. The next question is, How can we find the inverse $A^{-1}$ if A does
pass that test?


Expansion of a Determinant by Alien Cofactors
Before answering this query, let us discuss another important property of determinants.

Property VI The expansion of a determinant by alien cofactors (the cofactors of a
“wrong” row or column) always yields a value of zero.

Example 1

If we expand the determinant $\begin{vmatrix} 4 & 1 & 2 \\ 5 & 2 & 1 \\ 1 & 0 & 3 \end{vmatrix}$ by using its first-row elements but the cofactors
of the second-row elements

\[
|C_{21}| = -\begin{vmatrix} 1 & 2 \\ 0 & 3 \end{vmatrix} = -3
\qquad
|C_{22}| = \begin{vmatrix} 4 & 2 \\ 1 & 3 \end{vmatrix} = 10
\qquad
|C_{23}| = -\begin{vmatrix} 4 & 1 \\ 1 & 0 \end{vmatrix} = 1
\]

we get $a_{11}|C_{21}| + a_{12}|C_{22}| + a_{13}|C_{23}| = 4(-3) + 1(10) + 2(1) = 0$.


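More generally, applying the same type of expansion by alien cofactors as described in
Example 1 to the determinant $|A| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$ will yield a zero sum of products as
follows:

\[
\begin{aligned}
\sum_{j=1}^{3} a_{1j}|C_{2j}| &= a_{11}|C_{21}| + a_{12}|C_{22}| + a_{13}|C_{23}| \\
&= -a_{11}\begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix}
 + a_{12}\begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix}
 - a_{13}\begin{vmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{vmatrix} \\
&= -a_{11}a_{12}a_{33} + a_{11}a_{13}a_{32} + a_{11}a_{12}a_{33} - a_{12}a_{13}a_{31} \\
&\quad - a_{11}a_{13}a_{32} + a_{12}a_{13}a_{31} = 0
\end{aligned}
\tag{5.12}
\]

The reason for this outcome lies in the fact that the sum of products in (5.12) can be
considered as the result of the regular expansion by the second row of another determinant
$|A^*| \equiv \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{11} & a_{12} & a_{13} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$, which differs from $|A|$ only in its second row and whose first
two rows are identical. As an exercise, write out the cofactors of the second row of $|A^*|$
and verify that these are precisely the cofactors which appeared in (5.12) and with the
correct signs. Since $|A^*| = 0$, because of its two identical rows, the expansion by alien
cofactors shown in (5.12) will of necessity yield a value of zero also.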










Property VI is valid for determinants of all orders and applies when a determinant is expanded by the alien cofactors of any row or any column. Thus we may state, in general, that for a determinant of order n the following holds:

$$\begin{aligned}
\sum_{j=1}^{n} a_{ij}|C_{i'j}| &= 0 \quad (i \neq i') && \text{[expansion by } i\text{th row and cofactors of } i'\text{th row]} \\
\sum_{i=1}^{n} a_{ij}|C_{ij'}| &= 0 \quad (j \neq j') && \text{[expansion by } j\text{th column and cofactors of } j'\text{th column]}
\end{aligned} \tag{5.13}$$

Carefully compare (5.13) with (5.8). In the latter (regular Laplace expansion), the subscripts of $a_{ij}$ and of $|C_{ij}|$ must be identical in each product term in the sum. In the expansion by alien cofactors, such as in (5.13), on the other hand, one of the two subscripts (a chosen value of $i'$ or $j'$) is inevitably "out of place."
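Property VI is easy to check numerically. The following is a minimal sketch, not part of the original text: the helper names `minor`, `det`, `cofactor`, and `expand_by_row` are illustrative assumptions, and the determinant is computed by plain Laplace expansion, which is practical only for small matrices.

```python
def minor(m, i, j):
    """Submatrix of m with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if not m:
        return 1  # convention for the empty (0 x 0) determinant
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def cofactor(m, i, j):
    """Signed minor |C_ij|."""
    return (-1) ** (i + j) * det(minor(m, i, j))


def expand_by_row(m, i, i_cof):
    """Sum over j of a_ij * |C_(i_cof, j)|: elements of row i, cofactors of row i_cof."""
    return sum(m[i][j] * cofactor(m, i_cof, j) for j in range(len(m)))


# The determinant of Example 1 (rows are 0-based here, so row 0 is the "first row").
A = [[4, 1, 2], [5, 2, 1], [1, 0, 3]]
regular = expand_by_row(A, 0, 0)  # regular Laplace expansion, equals |A|
alien = expand_by_row(A, 0, 1)    # first-row elements, second-row cofactors
```

With `i == i_cof` the function reproduces the regular expansion (5.8); with any other pairing it reproduces the alien-cofactor sums in (5.13).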


Matrix Inversion 
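Property VI, as summarized in (5.13), is of direct help in developing a method of matrix inversion, i.e., of finding the inverse of a matrix.

Assume that an n × n nonsingular matrix A is given:

$$\underset{(n \times n)}{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \qquad (|A| \neq 0) \tag{5.14}$$

Since each element of A has a cofactor $|C_{ij}|$, it is possible to form a matrix of cofactors by replacing each element $a_{ij}$ in (5.14) with its cofactor $|C_{ij}|$. Such a cofactor matrix, denoted by $C = [|C_{ij}|]$, must also be n × n. For our present purposes, however, the transpose of C is of more interest. This transpose $C'$ is commonly referred to as the adjoint of A and is symbolized by adj A. Written out, the adjoint takes the form

$$\underset{(n \times n)}{C'} \equiv \operatorname{adj} A \equiv \begin{bmatrix} |C_{11}| & |C_{21}| & \cdots & |C_{n1}| \\ |C_{12}| & |C_{22}| & \cdots & |C_{n2}| \\ \vdots & \vdots & & \vdots \\ |C_{1n}| & |C_{2n}| & \cdots & |C_{nn}| \end{bmatrix} \tag{5.15}$$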





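The matrices A and $C'$ are conformable for multiplication, and their product $AC'$ is another n × n matrix in which each element is a sum of products. By utilizing the formula for Laplace expansion as well as Property VI of determinants, the product $AC'$ may be expressed as follows:

$$\begin{aligned}
\underset{(n \times n)}{AC'} &= \begin{bmatrix}
\sum_{j=1}^{n} a_{1j}|C_{1j}| & \sum_{j=1}^{n} a_{1j}|C_{2j}| & \cdots & \sum_{j=1}^{n} a_{1j}|C_{nj}| \\
\sum_{j=1}^{n} a_{2j}|C_{1j}| & \sum_{j=1}^{n} a_{2j}|C_{2j}| & \cdots & \sum_{j=1}^{n} a_{2j}|C_{nj}| \\
\vdots & \vdots & & \vdots \\
\sum_{j=1}^{n} a_{nj}|C_{1j}| & \sum_{j=1}^{n} a_{nj}|C_{2j}| & \cdots & \sum_{j=1}^{n} a_{nj}|C_{nj}|
\end{bmatrix} \\
&= \begin{bmatrix} |A| & 0 & \cdots & 0 \\ 0 & |A| & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & |A| \end{bmatrix} \qquad \text{[by (5.8) and (5.13)]} \\
&= |A| \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = |A| I_n \qquad \text{[factoring]}
\end{aligned}$$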


As the determinant $|A|$ is a nonzero scalar, it is permissible to divide both sides of the equation $AC' = |A|I$ by $|A|$. The result is

$$\frac{AC'}{|A|} = I \qquad \text{or} \qquad A\,\frac{C'}{|A|} = I$$

Premultiplying both sides of the last equation by $A^{-1}$, and using the result that $A^{-1}A = I$, we can get $\dfrac{C'}{|A|} = A^{-1}$, or

$$A^{-1} = \frac{1}{|A|} \operatorname{adj} A \qquad \text{[by (5.15)]} \tag{5.16}$$

Now, we have found a way to invert the matrix A!

The general procedure for finding the inverse of a square matrix A thus involves the following steps: (1) find $|A|$ [we need to proceed with the subsequent steps if and only if $|A| \neq 0$, for if $|A| = 0$, the inverse in (5.16) will be undefined]; (2) find the cofactors of all the elements of A, and arrange them as a matrix $C = [|C_{ij}|]$; (3) take the transpose of C to get adj A; and (4) divide adj A by the determinant $|A|$. The result will be the desired inverse $A^{-1}$.
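The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `det` and `inverse` are hypothetical, exact rational arithmetic from the standard `fractions` module is used to avoid rounding, and the recursive Laplace expansion is suitable only for small matrices.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if not m:
        return 1  # empty determinant, so the cofactor of a 1 x 1 element is 1
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        sub = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(sub)
    return total


def inverse(m):
    """Inverse by the adjoint method: A^(-1) = (1/|A|) adj A, as in (5.16)."""
    n = len(m)
    d = det(m)                                   # step 1: find |A|
    if d == 0:
        raise ValueError("|A| = 0: the matrix is singular and has no inverse")

    def cofactor(i, j):
        sub = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
        return (-1) ** (i + j) * det(sub)

    c = [[cofactor(i, j) for j in range(n)] for i in range(n)]     # step 2: C
    adj = [[c[j][i] for j in range(n)] for i in range(n)]          # step 3: C'
    return [[Fraction(adj[i][j], d) for j in range(n)] for i in range(n)]  # step 4


A = [[3, 2], [1, 0]]
B = [[4, 1, -1], [0, 3, 2], [3, 0, 7]]
A_inv = inverse(A)
B_inv = inverse(B)
```

Here `A` and `B` are the matrices of Examples 2 and 3 in this section, so the results can be compared with the worked answers.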


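Example 2 Find the inverse of $A = \begin{bmatrix} 3 & 2 \\ 1 & 0 \end{bmatrix}$. Since $|A| = -2 \neq 0$, the inverse $A^{-1}$ exists. The cofactor of each element is in this case a 1 × 1 determinant, which is simply defined as the scalar element of that determinant itself (that is, $|a_{ij}| \equiv a_{ij}$). Thus, we have

$$C = \begin{bmatrix} |C_{11}| & |C_{12}| \\ |C_{21}| & |C_{22}| \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ -2 & 3 \end{bmatrix}$$

Observe the minus signs attached to 1 and 2, as required for cofactors. Transposing the cofactor matrix yields

$$\operatorname{adj} A = \begin{bmatrix} 0 & -2 \\ -1 & 3 \end{bmatrix}$$

so the inverse $A^{-1}$ can be written as

$$A^{-1} = \frac{1}{|A|} \operatorname{adj} A = -\frac{1}{2} \begin{bmatrix} 0 & -2 \\ -1 & 3 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ \tfrac{1}{2} & -\tfrac{3}{2} \end{bmatrix}$$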










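Example 3 Find the inverse of $B = \begin{bmatrix} 4 & 1 & -1 \\ 0 & 3 & 2 \\ 3 & 0 & 7 \end{bmatrix}$. Since $|B| = 99 \neq 0$, the inverse $B^{-1}$ also exists. The cofactor matrix is

$$\begin{bmatrix}
\begin{vmatrix} 3 & 2 \\ 0 & 7 \end{vmatrix} & -\begin{vmatrix} 0 & 2 \\ 3 & 7 \end{vmatrix} & \begin{vmatrix} 0 & 3 \\ 3 & 0 \end{vmatrix} \\[2ex]
-\begin{vmatrix} 1 & -1 \\ 0 & 7 \end{vmatrix} & \begin{vmatrix} 4 & -1 \\ 3 & 7 \end{vmatrix} & -\begin{vmatrix} 4 & 1 \\ 3 & 0 \end{vmatrix} \\[2ex]
\begin{vmatrix} 1 & -1 \\ 3 & 2 \end{vmatrix} & -\begin{vmatrix} 4 & -1 \\ 0 & 2 \end{vmatrix} & \begin{vmatrix} 4 & 1 \\ 0 & 3 \end{vmatrix}
\end{bmatrix} = \begin{bmatrix} 21 & 6 & -9 \\ -7 & 31 & 3 \\ 5 & -8 & 12 \end{bmatrix}$$

Therefore,

$$\operatorname{adj} B = \begin{bmatrix} 21 & -7 & 5 \\ 6 & 31 & -8 \\ -9 & 3 & 12 \end{bmatrix}$$

and the desired inverse matrix is

$$B^{-1} = \frac{1}{|B|} \operatorname{adj} B = \frac{1}{99} \begin{bmatrix} 21 & -7 & 5 \\ 6 & 31 & -8 \\ -9 & 3 & 12 \end{bmatrix}$$

You can check that the results in Examples 2 and 3 do satisfy $AA^{-1} = A^{-1}A = I$ and $BB^{-1} = B^{-1}B = I$, respectively.
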
EXERCISE 5.4 


1. Suppose that we expand a fourth-order determinant by its third column and the cofactors of the second-column elements. How would you write the resulting sum of products in Σ notation? What will be the sum of products in Σ notation if we expand it by the second row and the cofactors of the fourth-row elements?


2. Find the inverse of each of the following matrices; 


om] oa-[3 9] oc I] won § 


3. (a) Drawing on your answers to Prob. 2, formulate a two-step rule for finding the adjoint of a given 2 × 2 matrix A: In the first step, indicate what should be done to the two diagonal elements of A in order to get the diagonal elements of adj A; in the second step, indicate what should be done to the two off-diagonal elements of A. (Warning: This rule applies only to 2 × 2 matrices.)

(b) Add a third step which, in conjunction with the previous two steps, yields the 2 × 2 inverse matrix $A^{-1}$.

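4. Find the inverse of each of the following matrices:

(a) $E = \begin{bmatrix} 4 & -2 & 1 \\ 7 & 3 & 0 \\ 2 & 0 & 1 \end{bmatrix}$ (c) $G = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$

(b) $F = \begin{bmatrix} 1 & -1 & 2 \\ 1 & 0 & 3 \\ 4 & 0 & 2 \end{bmatrix}$ (d) $H = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$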




5. Find the inverse of

$$A = \begin{bmatrix} 4 & 1 & -5 \\ -2 & 3 & 1 \\ 3 & -1 & 4 \end{bmatrix}$$

6. Solve the system Ax = d by matrix inversion, where

(a) $4x + 3y = 28$
    $2x + 5y = 42$

(b) $4x_1 + x_2 - 5x_3 = 8$
    $-2x_1 + 3x_2 + x_3 = 12$
    $3x_1 - x_2 + 4x_3 = 5$

7. Is it possible for a matrix to be its own inverse?


5.5 Cramer’s Rule 





The method of matrix inversion discussed in Sec. 5.4 enables us to derive a practical, if not always efficient, way of solving a linear-equation system, known as Cramer's rule.


Derivation of the Rule 
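Given an equation system Ax = d, where A is n × n, the solution can be written as

$$x^* = A^{-1}d = \frac{1}{|A|} (\operatorname{adj} A)\, d \qquad \text{[by (5.16)]}$$

provided A is nonsingular. According to (5.15), this means that

$$\begin{aligned}
\begin{bmatrix} x_1^* \\ x_2^* \\ \vdots \\ x_n^* \end{bmatrix}
&= \frac{1}{|A|} \begin{bmatrix} |C_{11}| & |C_{21}| & \cdots & |C_{n1}| \\ |C_{12}| & |C_{22}| & \cdots & |C_{n2}| \\ \vdots & \vdots & & \vdots \\ |C_{1n}| & |C_{2n}| & \cdots & |C_{nn}| \end{bmatrix} \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix} \\
&= \frac{1}{|A|} \begin{bmatrix} d_1|C_{11}| + d_2|C_{21}| + \cdots + d_n|C_{n1}| \\ d_1|C_{12}| + d_2|C_{22}| + \cdots + d_n|C_{n2}| \\ \vdots \\ d_1|C_{1n}| + d_2|C_{2n}| + \cdots + d_n|C_{nn}| \end{bmatrix}
= \frac{1}{|A|} \begin{bmatrix} \sum_{i=1}^{n} d_i|C_{i1}| \\ \sum_{i=1}^{n} d_i|C_{i2}| \\ \vdots \\ \sum_{i=1}^{n} d_i|C_{in}| \end{bmatrix}
\end{aligned}$$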


Equating the corresponding elements on the two sides of the equation, we obtain the solution values

$$x_1^* = \frac{1}{|A|} \sum_{i=1}^{n} d_i|C_{i1}| \qquad x_2^* = \frac{1}{|A|} \sum_{i=1}^{n} d_i|C_{i2}| \qquad \text{(etc.)} \tag{5.17}$$

The Σ terms in (5.17) look unfamiliar. What do they mean? From (5.8), we see that the Laplace expansion of a determinant $|A|$ by its first column can be expressed in the form $\sum_{i=1}^{n} a_{i1}|C_{i1}|$. If we replace the first column of $|A|$ by the column vector d but keep all the other columns intact, then a new determinant will result, which we can call $|A_1|$, the subscript 1 indicating that the first column has been replaced by d. The expansion of $|A_1|$ by its first column (the d column) will yield the expression $\sum_{i=1}^{n} d_i|C_{i1}|$, because the elements $d_i$ now take the place of the elements $a_{i1}$. Returning to (5.17), we see therefore that

$$x_1^* = \frac{1}{|A|} |A_1|$$

Similarly, if we replace the second column of $|A|$ by the column vector d, while retaining all the other columns, the expansion of the new determinant $|A_2|$ by its second column (the d column) will result in the expression $\sum_{i=1}^{n} d_i|C_{i2}|$. When divided by $|A|$, this latter sum will give us the solution value $x_2^*$, and so on.

This procedure can now be generalized. To find the solution value of the jth variable $x_j^*$, we can merely replace the jth column of the determinant $|A|$ by the constant terms $d_1, \ldots, d_n$ to get a new determinant $|A_j|$ and then divide $|A_j|$ by the original determinant $|A|$. Thus, the solution of the system Ax = d can be expressed as

$$x_j^* = \frac{|A_j|}{|A|} = \frac{1}{|A|} \begin{vmatrix} a_{11} & a_{12} & \cdots & d_1 & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & d_2 & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & d_n & \cdots & a_{nn} \end{vmatrix} \qquad \text{(}j\text{th column replaced by } d\text{)} \tag{5.18}$$

The result in (5.18) is the statement of Cramer's rule. Note that, whereas the matrix inversion method yields the solution values of all the endogenous variables at once ($x^*$ is a vector), Cramer's rule can give us the solution value of only a single endogenous variable at a time ($x_j^*$ is a scalar); this is why it may not be efficient.

Example 1 Find the solution of the equation system

$$\begin{aligned} 5x_1 + 3x_2 &= 30 \\ 6x_1 - 2x_2 &= 8 \end{aligned}$$

The coefficients and the constant terms give the following determinants:

$$|A| = \begin{vmatrix} 5 & 3 \\ 6 & -2 \end{vmatrix} = -28 \qquad |A_1| = \begin{vmatrix} 30 & 3 \\ 8 & -2 \end{vmatrix} = -84 \qquad |A_2| = \begin{vmatrix} 5 & 30 \\ 6 & 8 \end{vmatrix} = -140$$

Therefore, by virtue of (5.18), we can immediately write

$$x_1^* = \frac{|A_1|}{|A|} = \frac{-84}{-28} = 3 \qquad \text{and} \qquad x_2^* = \frac{|A_2|}{|A|} = \frac{-140}{-28} = 5$$

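Example 2 Find the solution of the equation system

$$\begin{aligned} 7x_1 - x_2 - x_3 &= 0 \\ 10x_1 - 2x_2 + x_3 &= 8 \\ 6x_1 + 3x_2 - 2x_3 &= 7 \end{aligned}$$

The relevant determinants $|A|$ and $|A_j|$ are found to be

$$|A| = \begin{vmatrix} 7 & -1 & -1 \\ 10 & -2 & 1 \\ 6 & 3 & -2 \end{vmatrix} = -61 \qquad |A_1| = \begin{vmatrix} 0 & -1 & -1 \\ 8 & -2 & 1 \\ 7 & 3 & -2 \end{vmatrix} = -61$$

$$|A_2| = \begin{vmatrix} 7 & 0 & -1 \\ 10 & 8 & 1 \\ 6 & 7 & -2 \end{vmatrix} = -183 \qquad |A_3| = \begin{vmatrix} 7 & -1 & 0 \\ 10 & -2 & 8 \\ 6 & 3 & 7 \end{vmatrix} = -244$$

thus the solution values of the variables are

$$x_1^* = \frac{|A_1|}{|A|} = \frac{-61}{-61} = 1 \qquad x_2^* = \frac{|A_2|}{|A|} = \frac{-183}{-61} = 3 \qquad x_3^* = \frac{|A_3|}{|A|} = \frac{-244}{-61} = 4$$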








Notice that in each of these examples we find $|A| \neq 0$. This is a necessary condition for the application of Cramer's rule, as it is for the existence of the inverse $A^{-1}$. Cramer's rule is, after all, based upon the concept of the inverse matrix, even though in practice it bypasses the process of matrix inversion.
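The rule in (5.18) translates almost directly into code. The sketch below is illustrative only: the function names `det` and `cramer` are assumptions, not anything defined in the text, and the determinant is evaluated by Laplace expansion, so the approach is meant for small systems.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        sub = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(sub)
    return total


def cramer(a, d):
    """Solve a x = d by Cramer's rule: x_j = |A_j| / |A|."""
    det_a = det(a)
    if det_a == 0:
        raise ValueError("|A| = 0: Cramer's rule is not applicable")
    solution = []
    for j in range(len(a)):
        # A_j: the coefficient matrix with its jth column replaced by d
        a_j = [row[:j] + [d[i]] + row[j + 1:] for i, row in enumerate(a)]
        solution.append(Fraction(det(a_j)) / det_a)
    return solution


x_example_1 = cramer([[5, 3], [6, -2]], [30, 8])
x_example_2 = cramer([[7, -1, -1], [10, -2, 1], [6, 3, -2]], [0, 8, 7])
```

The two calls reuse the systems of Examples 1 and 2, so the returned values can be compared with the worked solutions. The explicit check on `det_a` mirrors the requirement that $|A| \neq 0$.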


Note on Homogeneous-Equation Systems 
The equation systems Ax = d considered before can have any constants in the vector d. If d = 0, that is, if $d_1 = d_2 = \cdots = d_n = 0$, however, the equation system will become

$$Ax = 0$$

where 0 is a zero vector. This special case is referred to as a homogeneous-equation system. The word homogeneous here relates to the property that when all the variables $x_1, \ldots, x_n$ are multiplied by the same number, the equation system will remain valid. This is possible only if the constant terms of the system (those unattached to any $x_i$) are all zero.

If the matrix A is nonsingular, a homogeneous-equation system can yield only a "trivial solution," namely, $x_1^* = x_2^* = \cdots = x_n^* = 0$. This follows from the fact that the solution $x^* = A^{-1}d$ will in this case become

$$\underset{(n \times 1)}{x^*} = \underset{(n \times n)}{A^{-1}}\ \underset{(n \times 1)}{0} = \underset{(n \times 1)}{0}$$

Alternatively, this outcome can be derived from Cramer's rule. The fact that d = 0 implies that $|A_j|$, for all j, must contain a whole column of zeros, and thus the solution will turn out to be

$$x_j^* = \frac{|A_j|}{|A|} = \frac{0}{|A|} = 0 \qquad (j = 1, 2, \ldots, n)$$

Curiously enough, the only way to get a nontrivial solution from a homogeneous-equation system is to have $|A| = 0$, that is, to have a singular coefficient matrix A! In that event, we have

$$x_j^* = \frac{|A_j|}{|A|} = \frac{0}{0} \qquad (j = 1, 2, \ldots, n)$$

where the 0/0 expression is not equal to zero but is, rather, something undefined. Consequently, Cramer's rule is not applicable. This does not mean that we cannot obtain solutions; it means only that we cannot get a unique solution.

Consider the homogeneous-equation system

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 &= 0 \\ a_{21}x_1 + a_{22}x_2 &= 0 \end{aligned} \tag{5.19}$$

It is self-evident that $x_1^* = x_2^* = 0$ is a solution, but that solution is trivial. Now, assume that the coefficient matrix A is singular, so that $|A| = 0$. This implies that the row vector $[a_{11} \;\; a_{12}]$ is a multiple of the row vector $[a_{21} \;\; a_{22}]$; consequently, one of the two equations is redundant. By deleting, say, the second equation from (5.19), we end up with one (the first) equation in two variables, the solution of which is $x_1^* = (-a_{12}/a_{11})x_2^*$. This solution is nontrivial and well defined if $a_{11} \neq 0$, but it really represents an infinite number of solutions because, for every possible value of $x_2^*$, there is a corresponding value $x_1^*$ such that the pair constitutes a solution. Thus no unique nontrivial solution exists for this homogeneous-equation system. This last statement is also generally valid for the n-variable case.

Solution Outcomes for a Linear-Equation System
Our discussion of the several variants of the linear-equation system Ax = d reveals that as many as four different types of solution outcome are possible. For a better overall view of these variants, we list them in tabular form in Table 5.1.

TABLE 5.1 Solution Outcomes for a Linear-Equation System Ax = d

                                          Vector d
Determinant |A|              d ≠ 0                            d = 0
                             (nonhomogeneous system)          (homogeneous system)
|A| ≠ 0                      There exists a unique,           There exists a unique,
(matrix A nonsingular)       nontrivial solution x* ≠ 0.      trivial solution x* = 0.
|A| = 0
(matrix A singular)
  Equations dependent        There exist an infinite          There exist an infinite
                             number of solutions (not         number of solutions
                             including the trivial one).      (including the trivial one).
  Equations inconsistent     No solution exists.              [Not possible.]

As a first possibility, the system may yield a unique, nontrivial solution. This type of outcome can arise only when we have a nonhomogeneous system with a nonsingular coefficient matrix A. The second possible outcome is a unique, trivial solution, and this is associated with a homogeneous system with a nonsingular matrix A. As a third possibility, we may have an infinite number of solutions. This eventuality is linked exclusively to a system in which the equations are dependent (i.e., in which there are redundant equations). Depending on whether the system is homogeneous, the trivial solution may or may not be included in the set of infinite number of solutions. Finally, in the case of an inconsistent equation system, there exists no solution at all. From the point of view of a model builder, the most useful and desirable outcome is, of course, that of a unique, nontrivial solution.
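The four outcomes can be told apart mechanically. The following sketch is an assumption-laden illustration rather than a method stated in this passage: it uses a rank comparison between A and the augmented matrix [A d] to separate dependent from inconsistent equations, and the names `rank` and `classify` are hypothetical.

```python
from fractions import Fraction


def rank(m):
    """Rank of a matrix via Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0  # number of pivot rows found so far
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def classify(a, d):
    """Classify the solution outcome of the square system a x = d."""
    n = len(a)
    homogeneous = all(x == 0 for x in d)
    if rank(a) == n:  # equivalent to |A| != 0
        return "unique trivial" if homogeneous else "unique nontrivial"
    augmented = [row + [d_i] for row, d_i in zip(a, d)]
    if rank(augmented) > rank(a):
        return "no solution"       # equations inconsistent
    return "infinitely many"       # equations dependent


outcomes = [
    classify([[5, 3], [6, -2]], [30, 8]),
    classify([[5, 3], [6, -2]], [0, 0]),
    classify([[1, 2], [2, 4]], [1, 2]),
    classify([[1, 2], [2, 4]], [0, 0]),
    classify([[1, 2], [2, 4]], [1, 3]),
]
```

A homogeneous system can never land in the "no solution" branch, because appending a zero column cannot raise the rank; this matches the "[Not possible.]" cell of Table 5.1.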





EXERCISE 5.5

1. Use Cramer's rule to solve the following equation systems:

(a) $3x_1 - 2x_2 = 6$
    $2x_1 + x_2 = 11$

(b) $-x_1 + 3x_2 = -3$
    $4x_1 - x_2 = 12$

(c) $8x_1 - 7x_2 = 9$
    $x_1 + x_2 = 3$

(d) $5x_1 + 9x_2 = 14$
    $7x_1 - 3x_2 = 4$

2. For each of the equation systems in Prob. 1, find the inverse of the coefficient matrix, and get the solution by the formula $x^* = A^{-1}d$.

3. Use Cramer's rule to solve the following equation systems:

(a) $8x_1 - x_2 = 16$
    $2x_2 + 5x_3 = 5$
    $2x_1 + 3x_3 = 7$

(b) $-x_1 + 3x_2 + 2x_3 = 24$
    $x_1 + x_3 = 6$
    $5x_2 - x_3 = 8$

(c) $4x + 3y - 2z = 1$
    $x + 2y = 6$
    $3x + z = 4$

(d) $-x + y + z = a$
    $x - y + z = b$
    $x + y - z = c$

4. Show that Cramer's rule can be derived alternatively by the following procedure. Multiply both sides of the first equation in the system Ax = d by the cofactor $|C_{1j}|$, and then multiply both sides of the second equation by the cofactor $|C_{2j}|$, etc. Add all the newly obtained equations. Then assign the values 1, 2, ..., n to the index j, successively, to get the solution values $x_1^*, x_2^*, \ldots, x_n^*$ as shown in (5.17).


5.6 Application to Market and National-Income Models





Simple equilibrium models such as those discussed in Chap. 3 can be solved with ease by 
Cramer's rule or by matrix inversion. 


Market Model 
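The two-commodity model described in (3.12) can be written (after eliminating the quantity variables) as a system of two linear equations, as in (3.13'):

$$\begin{aligned} c_1P_1 + c_2P_2 &= -c_0 \\ \gamma_1P_1 + \gamma_2P_2 &= -\gamma_0 \end{aligned}$$

The three determinants needed, $|A|$, $|A_1|$, and $|A_2|$, have the following values:

$$\begin{aligned}
|A| &= \begin{vmatrix} c_1 & c_2 \\ \gamma_1 & \gamma_2 \end{vmatrix} = c_1\gamma_2 - c_2\gamma_1 \\
|A_1| &= \begin{vmatrix} -c_0 & c_2 \\ -\gamma_0 & \gamma_2 \end{vmatrix} = -c_0\gamma_2 + c_2\gamma_0 \\
|A_2| &= \begin{vmatrix} c_1 & -c_0 \\ \gamma_1 & -\gamma_0 \end{vmatrix} = -c_1\gamma_0 + c_0\gamma_1
\end{aligned}$$

Therefore the equilibrium prices must be

$$P_1^* = \frac{|A_1|}{|A|} = \frac{c_2\gamma_0 - c_0\gamma_2}{c_1\gamma_2 - c_2\gamma_1} \qquad P_2^* = \frac{|A_2|}{|A|} = \frac{c_0\gamma_1 - c_1\gamma_0}{c_1\gamma_2 - c_2\gamma_1}$$

which are precisely those obtained in (3.14) and (3.15). The equilibrium quantities can be found, as before, by setting $P_1 = P_1^*$ and $P_2 = P_2^*$ in the demand or supply functions.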





National-Income Model 
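The simple national-income model cited in (3.23) can also be solved by the use of Cramer's rule. As written in (3.23), the model consists of the following two simultaneous equations:

$$\begin{aligned} Y &= C + I_0 + G_0 \\ C &= a + bY \qquad (a > 0, \quad 0 < b < 1) \end{aligned}$$

These can be rearranged into the form

$$\begin{aligned} Y - C &= I_0 + G_0 \\ -bY + C &= a \end{aligned}$$

so that the endogenous variables Y and C appear only on the left of the equals signs, whereas the exogenous variables and the unattached parameter appear only on the right.

The coefficient matrix now takes the form $\begin{bmatrix} 1 & -1 \\ -b & 1 \end{bmatrix}$, and the column vector of constants (data), $\begin{bmatrix} I_0 + G_0 \\ a \end{bmatrix}$. Note that the sum $I_0 + G_0$ is considered as a single entity, i.e., a single element in the constant vector.

Cramer's rule now leads immediately to the following solution:

$$Y^* = \frac{\begin{vmatrix} (I_0 + G_0) & -1 \\ a & 1 \end{vmatrix}}{\begin{vmatrix} 1 & -1 \\ -b & 1 \end{vmatrix}} = \frac{I_0 + G_0 + a}{1 - b}$$

$$C^* = \frac{\begin{vmatrix} 1 & (I_0 + G_0) \\ -b & a \end{vmatrix}}{\begin{vmatrix} 1 & -1 \\ -b & 1 \end{vmatrix}} = \frac{a + b(I_0 + G_0)}{1 - b}$$

You should check that the solution values just obtained are identical with those shown in (3.24) and (3.25).

Let us now try to solve this model by inverting the coefficient matrix. Since the coefficient matrix is $A = \begin{bmatrix} 1 & -1 \\ -b & 1 \end{bmatrix}$, its cofactor matrix is $\begin{bmatrix} 1 & b \\ 1 & 1 \end{bmatrix}$, and we therefore have $\operatorname{adj} A = \begin{bmatrix} 1 & 1 \\ b & 1 \end{bmatrix}$. It follows that the inverse matrix is

$$A^{-1} = \frac{1}{|A|} \operatorname{adj} A = \frac{1}{1 - b} \begin{bmatrix} 1 & 1 \\ b & 1 \end{bmatrix}$$

We know that, for the equation system Ax = d, the solution is expressible as $x^* = A^{-1}d$. Applied to the present model, this means that

$$\begin{bmatrix} Y^* \\ C^* \end{bmatrix} = \frac{1}{1 - b} \begin{bmatrix} 1 & 1 \\ b & 1 \end{bmatrix} \begin{bmatrix} I_0 + G_0 \\ a \end{bmatrix} = \frac{1}{1 - b} \begin{bmatrix} I_0 + G_0 + a \\ b(I_0 + G_0) + a \end{bmatrix}$$

It is easy to see that this is again the same solution as obtained before.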
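As a quick numerical check of the inverse-matrix solution, the sketch below plugs in parameter values. The values chosen ($I_0 = 30$, $G_0 = 20$, $a = 10$, $b = 4/5$) are hypothetical and appear nowhere in the text, and the function name `solve_income_model` is likewise an assumption made for illustration.

```python
from fractions import Fraction


def solve_income_model(i0, g0, a, b):
    """Return (Y*, C*) for Y - C = I0 + G0 and -bY + C = a.

    Uses the inverse A^(-1) = (1 / (1 - b)) * [[1, 1], [b, 1]].
    """
    det_a = 1 - b
    if det_a == 0:
        raise ValueError("b = 1 makes the coefficient matrix singular")
    inv = [[1 / det_a, 1 / det_a], [b / det_a, 1 / det_a]]
    d = [i0 + g0, a]  # constant vector; I0 + G0 is treated as a single element
    y_star = inv[0][0] * d[0] + inv[0][1] * d[1]
    c_star = inv[1][0] * d[0] + inv[1][1] * d[1]
    return y_star, c_star


# Hypothetical parameter values, for illustration only.
I0, G0, a, b = 30, 20, 10, Fraction(4, 5)
Y_star, C_star = solve_income_model(I0, G0, a, b)
```

The result can be checked against the original equations: equilibrium income must equal consumption plus $I_0 + G_0$, and consumption must equal $a + bY^*$.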


IS-LM Model: Closed Economy 
As another linear model of the economy, we can think of the economy as being made up of two sectors: the real goods sector and the monetary sector.

The goods market involves the following equations:

$$\begin{aligned} Y &= C + I + G \\ C &= a + b(1 - t)Y \\ I &= d - ei \\ G &= G_0 \end{aligned}$$

The endogenous variables are Y, C, I, and i (where i is the rate of interest). The exogenous variable is $G_0$, while a, d, e, b, and t are structural parameters.

In the newly introduced money market, we have:

Equilibrium condition: $M_d = M_s$
Money demand: $M_d = kY - li$
Money supply: $M_s = M_0$

where $M_0$ is the exogenous stock of money and k and l are parameters. These three equations can be condensed into:

$$M_0 = kY - li$$

Together, the two sectors give us the following system of equations:

$$\begin{aligned} Y - C - I &= G_0 \\ b(1 - t)Y - C &= -a \\ I + ei &= d \\ kY - li &= M_0 \end{aligned}$$

Note that by further substitution the system could be further reduced to a 2 × 2 system of equations. For now, we will leave it as a 4 × 4 system. In matrix form, we have

$$\begin{bmatrix} 1 & -1 & -1 & 0 \\ b(1 - t) & -1 & 0 & 0 \\ 0 & 0 & 1 & e \\ k & 0 & 0 & -l \end{bmatrix} \begin{bmatrix} Y \\ C \\ I \\ i \end{bmatrix} = \begin{bmatrix} G_0 \\ -a \\ d \\ M_0 \end{bmatrix}$$




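To find the determinant of the coefficient matrix, we can use Laplace expansion on one of the columns (preferably one with the most zeros). Expanding the fourth column, we find

$$\begin{aligned}
|A| &= (-e) \begin{vmatrix} 1 & -1 & -1 \\ b(1 - t) & -1 & 0 \\ k & 0 & 0 \end{vmatrix} - l \begin{vmatrix} 1 & -1 & -1 \\ b(1 - t) & -1 & 0 \\ 0 & 0 & 1 \end{vmatrix} \\
&= -ek \begin{vmatrix} -1 & -1 \\ -1 & 0 \end{vmatrix} - l \begin{vmatrix} 1 & -1 \\ b(1 - t) & -1 \end{vmatrix} \\
&= ek - l[(-1) - (-1)b(1 - t)] \\
&= ek + l[1 - b(1 - t)]
\end{aligned}$$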
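We can use Cramer's rule to find the equilibrium income $Y^*$. This is done by replacing the first column of the coefficient matrix A with the vector of exogenous variables and taking the ratio of the determinant of the new matrix to the original determinant, or

$$Y^* = \frac{|A_1|}{|A|} = \frac{\begin{vmatrix} G_0 & -1 & -1 & 0 \\ -a & -1 & 0 & 0 \\ d & 0 & 1 & e \\ M_0 & 0 & 0 & -l \end{vmatrix}}{ek + l[1 - b(1 - t)]}$$

Using Laplace expansion on the second column of the numerator produces

$$\begin{aligned}
Y^* &= \frac{(-1)(-1)^3 \begin{vmatrix} -a & 0 & 0 \\ d & 1 & e \\ M_0 & 0 & -l \end{vmatrix} + (-1)(-1)^4 \begin{vmatrix} G_0 & -1 & 0 \\ d & 1 & e \\ M_0 & 0 & -l \end{vmatrix}}{ek + l[1 - b(1 - t)]} \\
&= \frac{\begin{vmatrix} -a & 0 & 0 \\ d & 1 & e \\ M_0 & 0 & -l \end{vmatrix} - \begin{vmatrix} G_0 & -1 & 0 \\ d & 1 & e \\ M_0 & 0 & -l \end{vmatrix}}{ek + l[1 - b(1 - t)]}
\end{aligned}$$

By further expansion, we obtain

$$\begin{aligned}
Y^* &= \frac{-a \begin{vmatrix} 1 & e \\ 0 & -l \end{vmatrix} - \left( G_0 \begin{vmatrix} 1 & e \\ 0 & -l \end{vmatrix} + \begin{vmatrix} d & e \\ M_0 & -l \end{vmatrix} \right)}{ek + l[1 - b(1 - t)]} \\
&= \frac{al - [d(-l) - eM_0] - G_0(-l)}{ek + l[1 - b(1 - t)]} \\
&= \frac{l(a + d + G_0) + eM_0}{ek + l[1 - b(1 - t)]}
\end{aligned}$$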
Since the solution to $Y^*$ is linear with respect to the exogenous variables, we can rewrite $Y^*$ as

$$Y^* = \left( \frac{e}{ek + l[1 - b(1 - t)]} \right) M_0 + \left( \frac{l}{ek + l[1 - b(1 - t)]} \right) (a + d + G_0)$$

In this form, we can see that the Keynesian policy multipliers with respect to the money supply and government expenditure are the coefficients of $M_0$ and $G_0$, that is,

$$\text{Money-supply multiplier:} \quad \frac{e}{ek + l[1 - b(1 - t)]}$$

and

$$\text{Government-expenditure multiplier:} \quad \frac{l}{ek + l[1 - b(1 - t)]}$$
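The closed-form expression for $Y^*$ can be cross-checked by solving the 4 × 4 system numerically with Cramer's rule. This is only a sketch: every parameter value below is hypothetical and chosen for round numbers, and the helper names `det` and `cramer` are illustrative assumptions.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        sub = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(sub)
    return total


def cramer(coeff, consts):
    """Solve coeff x = consts by Cramer's rule."""
    det_a = det(coeff)
    if det_a == 0:
        raise ValueError("|A| = 0: Cramer's rule is not applicable")
    return [
        Fraction(det([row[:j] + [consts[i]] + row[j + 1:]
                      for i, row in enumerate(coeff)])) / det_a
        for j in range(len(coeff))
    ]


# Hypothetical parameter values (not from the text).
a, b, t = 50, Fraction(4, 5), Fraction(1, 4)
d, e = 100, 200
k, l = Fraction(1, 4), 100
G0, M0 = 75, 90

coeff = [
    [1, -1, -1, 0],
    [b * (1 - t), -1, 0, 0],
    [0, 0, 1, e],
    [k, 0, 0, -l],
]
consts = [G0, -a, d, M0]
Y, C, I, i = cramer(coeff, consts)

denominator = e * k + l * (1 - b * (1 - t))
Y_formula = (l * (a + d + G0) + e * M0) / denominator
money_multiplier = e / denominator
government_multiplier = l / denominator
```

With these values the determinant of the coefficient matrix equals the denominator $ek + l[1 - b(1 - t)]$, and the Cramer's-rule value of Y agrees with the formula.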


Matrix Algebra versus Elimination of Variables 

The economic models used for illustration above involve two or four equations only, and thus only fourth or lower-order determinants need to be evaluated. For large equation systems, higher-order determinants will appear, and their evaluation will be more complicated. And so will be the inversion of large matrices. From the computational point of view, in fact, matrix inversion and Cramer's rule are not necessarily more efficient than the method of successive eliminations of variables.

However, matrix methods have other merits. As we have seen from the preceding pages, matrix algebra gives us a compact notation for any linear-equation system, and also furnishes a determinantal criterion for testing the existence of a unique solution. These are advantages not otherwise available. In addition, it should be noted that, unlike the elimination-of-variable method, which affords no means of analytically expressing the solution, the matrix-inversion method and Cramer's rule do provide the handy solution expressions $x^* = A^{-1}d$ and $x_j^* = |A_j|/|A|$. Such analytical expressions of the solution are useful not only because they are in themselves a summary statement of the actual solution procedure, but also because they make possible the performance of further mathematical operations on the solution as written, if called for.

Under certain circumstances, matrix methods can even claim a computational advantage, such as when the task is to solve at the same time several equation systems having an identical coefficient matrix A but different constant-term vectors. In such cases, the elimination-of-variable method would require that the computational procedure be repeated each time a new equation system is considered. With the matrix-inversion method, however, we are required to find the common inverse matrix $A^{-1}$ only once; then the same inverse can be used to premultiply all the constant-term vectors pertaining to the various equation systems involved, in order to obtain their respective solutions. This particular computational advantage will take on great practical significance when we consider the solution of the Leontief input-output models in Sec. 5.7.
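The reuse of a single inverse for several constant-term vectors can be sketched as follows. The example is an illustration only: it borrows the 2 × 2 coefficient matrix of Example 1 in Sec. 5.5, the extra constant vectors are made up, and the function names `inverse_2x2` and `mat_vec` are hypothetical.

```python
from fractions import Fraction


def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix by the adjoint method."""
    (a, b), (c, d) = m
    det_m = a * d - b * c
    if det_m == 0:
        raise ValueError("singular matrix: no inverse exists")
    # adj M swaps the diagonal elements and negates the off-diagonal ones
    return [
        [Fraction(d, det_m), Fraction(-b, det_m)],
        [Fraction(-c, det_m), Fraction(a, det_m)],
    ]


def mat_vec(m, v):
    """Product of a matrix and a column vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


A = [[5, 3], [6, -2]]
A_inv = inverse_2x2(A)  # computed only once

# Several systems A x = d sharing the same A (the last two d vectors are made up).
constant_vectors = [[30, 8], [8, 4], [0, 0]]
solutions = [mat_vec(A_inv, d) for d in constant_vectors]
```

Each additional system costs only one matrix-vector product, which is the computational advantage described above.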





EXERCISE 5.6 


1. Solve the national-income model in Exercise 3.5-1:
(a) By matrix inversion (b) By Cramer's rule
(List the variables in the order Y, C, T.)

2. Solve the national-income model in Exercise 3.5-2:
(a) By matrix inversion (b) By Cramer's rule
(List the variables in the order Y, C, G.)

3. Let the IS equation be

$$Y = \frac{A}{1 - b} - \frac{g}{1 - b}\, i$$

where 1 − b is the marginal propensity to save, g is the investment sensitivity to interest rates, and A is an aggregate of exogenous variables. Let the LM equation be

$$Y = \frac{M_0}{k} + \frac{l}{k}\, i$$

where k and l are income and interest sensitivity of money demand, respectively, and $M_0$ is real money balances.

If b = 0.7, g = 100, A = 252, k = 0.25, l = 200, and $M_0$ = 176, then
(a) Write the IS-LM system in matrix form.
(b) Solve for Y and i by matrix inversion.


5.7 Leontief Input-Output Models

In its "static" version, the input-output analysis of Professor Wassily Leontief, a Nobel Prize winner,† deals with this particular question: "What level of output should each of the n industries in an economy produce, in order that it will just be sufficient to satisfy the total demand for that product?"

The rationale for the term input-output analysis is quite plain to see. The output of any industry (say, the steel industry) is needed as an input in many other industries, or even for that industry itself; therefore the "correct" (i.e., shortage-free as well as surplus-free) level of steel output will depend on the input requirements of all the n industries. In turn, the output of many other industries will enter into the steel industry as inputs, and consequently the "correct" levels of the other products will in turn depend partly upon the input requirements of the steel industry. In view of this interindustry dependence, any set of "correct" output levels for the n industries must be one that is consistent with all the input requirements in the economy, so that no bottlenecks will arise anywhere. In this light, it is clear that input-output analysis should be of great use in production planning, such as in planning for the economic development of a country or for a program of national defense.

Strictly speaking, input-output analysis is not a form of the general equilibrium analysis as discussed in Chap. 3. Although the interdependence of the various industries is emphasized, the "correct" output levels envisaged are those which satisfy technical input-output relationships rather than market equilibrium conditions. Nevertheless, the problem posed in input-output analysis also boils down to one of solving a system of simultaneous equations, and matrix algebra can again be of service.








Structure of an Input-Output Model 

Since an input-output model normally encompasses a large number of industries, its framework is of necessity rather involved. To simplify the problem, the following assumptions are as a rule adopted: (1) each industry produces only one homogeneous commodity (broadly interpreted, this does permit the case of two or more jointly produced commodities, provided they are produced in a fixed proportion to one another); (2) each industry uses a fixed input ratio (or factor combination) for the production of its output; and (3) production in every industry is subject to constant returns to scale, so that a k-fold change in every input will result in an exactly k-fold change in the output. These assumptions are, of course, unrealistic. A saving grace is that, if an industry produces two different commodities or uses two different possible factor combinations, then that industry may, at least conceptually, be broken down into two separate industries.

† Wassily W. Leontief, The Structure of American Economy 1919-1939, 2d ed., Oxford University Press, Fair Lawn, N.J., 1951.

TABLE 5.2 Input-Coefficient Matrix

                        Output
Input      I       II      III     ...     N
I          a11     a12     a13     ...     a1n
II         a21     a22     a23     ...     a2n
III        a31     a32     a33     ...     a3n
...        ...     ...     ...             ...
N          an1     an2     an3     ...     ann

From these assumptions we see that, in order to produce each unit of the jth commodity, the input need for the ith commodity must be a fixed amount, which we shall denote by $a_{ij}$. Specifically, the production of each unit of the jth commodity will require $a_{1j}$ (amount) of the first commodity, $a_{2j}$ of the second commodity, ..., and $a_{nj}$ of the nth commodity. (The order of the subscripts in $a_{ij}$ is easy to remember: The first subscript refers to the input, and the second to the output, so that $a_{ij}$ indicates how much of the ith commodity is used for the production of each unit of the jth commodity.) For our purposes, we may assume prices to be given and, thus, adopt "a dollar's worth" of each commodity as its unit. Then the statement $a_{32} = 0.35$ will mean that 35 cents' worth of the third commodity is required as an input for producing a dollar's worth of the second commodity. The $a_{ij}$ symbol will be referred to as an input coefficient.

For an n-industry economy, the input coefficients can be arranged into a matrix $A = [a_{ij}]$, as in Table 5.2, in which each column specifies the input requirements for the production of one unit of the output of a particular industry. The second column, for example, states that to produce a unit (a dollar's worth) of commodity II, the inputs needed are: $a_{12}$ units of commodity I, $a_{22}$ units of commodity II, etc. If no industry uses its own product as an input, then the elements in the principal diagonal of matrix A will all be zero.


The Open Model 
If the n industries in Table 5.2 constitute the entirety of the economy, then all their products would be for the sole purpose of meeting the input demand of the same n industries (to be used in further production) as against the final demand (such as consumer demand, not for further production). At the same time, all the inputs used in the economy would be in the nature of intermediate inputs (those supplied by the n industries) as against primary inputs (such as labor, not an industrial product). To allow for the presence of final demand and primary inputs, we must include in the model an open sector outside of the n-industry network. Such an open sector can accommodate the activities of the consumer households, the government sector, and even foreign countries.

In view of the presence of the open sector, the sum of the elements in each column of the input-coefficient matrix A (or input matrix A, for short) must be less than 1. Each column sum represents the partial input cost (not including the cost of primary inputs) incurred in producing a dollar's worth of some commodity; if this sum is greater than or equal to $1, therefore, production will not be economically justifiable. Symbolically, this fact may be stated thus:

$$\sum_{i=1}^{n} a_{ij} < 1 \qquad (j = 1, 2, \ldots, n)$$

where the summation is over i, that is, over the elements appearing in the various rows of a specific column j. Carrying this line of thought a step further, it may also be stated that, since the value of output ($1) must be fully absorbed by the payments to all factors of production, the amount by which the column sum falls short of $1 must represent the payment to the primary inputs of the open sector. Thus the value of the primary inputs needed in producing a unit of the jth commodity should be $1 - \sum_{i=1}^{n} a_{ij}$.
If industry I is to produce an output just sufficient to meet the input requirements of the n industries as well as the final demand of the open sector, its output level $x_1$ must satisfy the following equation:

$$x_1 = a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n + d_1$$

where $d_1$ denotes the final demand for its output and $a_{1j}x_j$ represents the input demand from the jth industry.‡ By the same token, the output levels of the other industries should satisfy the equations

$$\begin{aligned} x_2 &= a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n + d_2 \\ &\;\;\vdots \\ x_n &= a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n + d_n \end{aligned}$$

After moving all terms that involve the variables $x_j$ to the left of the equals signs, and leaving only the exogenously determined final demands $d_j$ on the right, we can express the "correct" output levels of the n industries by the following system of n linear equations:

$$\begin{aligned}
(1 - a_{11})x_1 - a_{12}x_2 - \cdots - a_{1n}x_n &= d_1 \\
-a_{21}x_1 + (1 - a_{22})x_2 - \cdots - a_{2n}x_n &= d_2 \\
&\;\;\vdots \\
-a_{n1}x_1 - a_{n2}x_2 - \cdots + (1 - a_{nn})x_n &= d_n
\end{aligned} \tag{5.20}$$

In matrix notation, this may be written as

$$\begin{bmatrix} (1 - a_{11}) & -a_{12} & \cdots & -a_{1n} \\ -a_{21} & (1 - a_{22}) & \cdots & -a_{2n} \\ \vdots & \vdots & & \vdots \\ -a_{n1} & -a_{n2} & \cdots & (1 - a_{nn}) \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix} \tag{5.20'}$$

If the 1s in the principal diagonal of the matrix on the left are ignored, the matrix is simply $-A = [-a_{ij}]$. As it is, on the other hand, the matrix is the sum of the identity matrix $I_n$ (with 1s in its principal diagonal and with 0s everywhere else) and the matrix $-A$. Thus (5.20') can also be written as

$$(I - A)x = d \tag{5.20''}$$

where x and d are, respectively, the variable vector and the final-demand (constant-term) vector. The matrix $I - A$ is called the Leontief matrix. As long as $I - A$ is nonsingular, we shall be able to find its inverse $(I - A)^{-1}$, and obtain the unique solution of the system from the equation

$$x^* = (I - A)^{-1}d \tag{5.21}$$

‡ Do not ever add up the input coefficients across a row; such a sum (say, $a_{11} + a_{12} + \cdots + a_{1n}$) is devoid of any useful economic meaning. The sum of the products $a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n$, on the other hand, does have an economic meaning; it represents the total amount of $x_1$ needed as input for all the n industries.


A Numerical Example 
For purposes of illustration, suppose that there are only three industries in the economy and
one primary input, and that the input-coefficient matrix is as follows (let us use decimal
values this time):

\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
= \begin{bmatrix} 0.2 & 0.3 & 0.2 \\ 0.4 & 0.1 & 0.2 \\ 0.1 & 0.3 & 0.2 \end{bmatrix}
\tag{5.22}
\]

Note that each column sum in \(A\) is less than 1, as it should be. Further, if we denote by \(a_{0j}\)
the dollar amount of the primary input used in producing a dollar's worth of the \(j\)th
commodity, we can write [by subtracting each column sum in (5.22) from 1]:

\[ a_{01} = 0.3 \qquad a_{02} = 0.3 \qquad \text{and} \qquad a_{03} = 0.4 \tag{5.23} \]


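With the matrix \(A\) of (5.22), the open input-output system can be expressed in the form
\((I - A)x = d\) as follows:

\[
\begin{bmatrix} 0.8 & -0.3 & -0.2 \\ -0.4 & 0.9 & -0.2 \\ -0.1 & -0.3 & 0.8 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
=
\begin{bmatrix} d_1 \\ d_2 \\ d_3 \end{bmatrix}
\tag{5.24}
\]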


Here we have deliberately not given specific values to the final demands \(d_1\), \(d_2\), and \(d_3\). In
this way, by keeping the vector \(d\) in parametric form, our solution will appear as a “formula”
into which we can feed various specific \(d\) vectors to obtain various corresponding specific
solutions.

By inverting the \(3 \times 3\) Leontief matrix, the solution of (5.24) can be found,
approximately (because of rounding of decimal figures), to be

\[
\begin{bmatrix} x_1^* \\ x_2^* \\ x_3^* \end{bmatrix}
= (I - A)^{-1} d
= \frac{1}{0.384}
\begin{bmatrix} 0.66 & 0.30 & 0.24 \\ 0.34 & 0.62 & 0.24 \\ 0.21 & 0.27 & 0.60 \end{bmatrix}
\begin{bmatrix} d_1 \\ d_2 \\ d_3 \end{bmatrix}
\]

If the specific final-demand vector (say, the final-output target of a development program)
happens to be \(d = \begin{bmatrix} 10 \\ 5 \\ 6 \end{bmatrix}\), in billions of dollars, then the
following specific solution values will emerge (again in billions of dollars):

\[ x_1^* = \frac{1}{0.384}\,[0.66(10) + 0.30(5) + 0.24(6)] = \frac{9.54}{0.384} = 24.84 \]


116 Part Two Static (or Equilibrium) Analysis


and similarly,

\[ x_2^* = \frac{7.94}{0.384} = 20.68 \qquad \text{and} \qquad x_3^* = \frac{7.05}{0.384} = 18.36 \]
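The inversion and the three solution values can be cross-checked numerically. The following is a minimal sketch, assuming exact rational arithmetic from Python's standard-library `fractions` module; the helper name `invert` and the variable names are illustrative choices, not part of the text.

```python
from fractions import Fraction

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination on [M | I]."""
    n = len(matrix)
    # Augment M with the identity matrix, using exact rational entries.
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Pick a row at or below the diagonal whose entry in this column is nonzero.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]          # scale pivot row to 1
        for r in range(n):
            if r != col and aug[r][col] != 0:          # clear the rest of the column
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Input-coefficient matrix A of (5.22), entered as decimal strings to stay exact.
A = [[Fraction(s) for s in row] for row in (("0.2", "0.3", "0.2"),
                                            ("0.4", "0.1", "0.2"),
                                            ("0.1", "0.3", "0.2"))]
n = len(A)
leontief = [[Fraction(int(i == j)) - A[i][j] for j in range(n)] for i in range(n)]
inverse = invert(leontief)

d = [10, 5, 6]                                         # final demands, billions of dollars
x_star = [sum(inverse[i][j] * d[j] for j in range(n)) for i in range(n)]
print([round(float(x), 2) for x in x_star])            # [24.84, 20.68, 18.36]
```

Working with fractions rather than floating-point numbers avoids the rounding that the text mentions; the two-decimal figures appear only when the results are printed.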
An important question now arises. The production of the output mix \(x_1^*\), \(x_2^*\), and \(x_3^*\) must
entail a definite required amount of the primary input. Would the amount required be
consistent with what is available in the economy? On the basis of (5.23), the required primary
input may be calculated as follows:

\[ \sum_{j=1}^{3} a_{0j}x_j^* = 0.3(24.84) + 0.3(20.68) + 0.4(18.36) = \$21.00 \text{ billion} \]

Therefore, the specific final demand \(d = \begin{bmatrix} 10 \\ 5 \\ 6 \end{bmatrix}\) will be
feasible if and only if the available amount of the primary input is at least $21 billion. If
the amount available falls short, then that particular production target will, of course, have
to be revised downward accordingly.
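The $21 billion figure can be cross-checked without computing the outputs at all. The derivation below is a sketch that uses only two facts from the text: each \(a_{0j}\) is 1 minus the \(j\)th column sum of \(A\), and the solution satisfies \((I - A)x^* = d\). It shows that the primary-input requirement equals the sum of the final demands.

```latex
\[
\sum_{j=1}^{3} a_{0j}x_j^*
  = \sum_{j=1}^{3}\Bigl(1 - \sum_{i=1}^{3} a_{ij}\Bigr)x_j^*
  = \sum_{i=1}^{3}\Bigl(x_i^* - \sum_{j=1}^{3} a_{ij}x_j^*\Bigr)
  = \sum_{i=1}^{3} d_i
  = 10 + 5 + 6 = 21
\]
```

The middle step merely reorders the double sum; the third equality is the \(i\)th equation of \((I - A)x^* = d\).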
One notable feature of the previous analysis is that, as long as the input coefficients
remain the same, the inverse \((I - A)^{-1}\) will not change; therefore only one matrix
inversion needs to be performed, even if we are to consider a hundred or a thousand different
final-demand vectors, such as a spectrum of alternative development targets. This economizes
the computational effort as compared with the elimination-of-variable method. However, this
advantage is not shared by Cramer's rule as outlined in (5.18), because each time a different
final-demand vector \(d\) is used, we must calculate a new determinant as the numerator in
(5.18), which is not as simple as multiplying a known inverse matrix \((I - A)^{-1}\) by a new
vector \(d\).
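The point about reusing the inverse can be sketched as follows. The inverse is entered once, exactly as reported in the text (the matrix of two-decimal entries divided by 0.384), and then applied to several final-demand vectors. Only the first target comes from the text; the other two are hypothetical, and the function name `solve_outputs` is an illustrative choice.

```python
from fractions import Fraction

# Inverse of the Leontief matrix as given in the text: (1 / 0.384) * ADJ.
ADJ = [[Fraction(s) for s in row] for row in (("0.66", "0.30", "0.24"),
                                              ("0.34", "0.62", "0.24"),
                                              ("0.21", "0.27", "0.60"))]
DET = Fraction("0.384")
INVERSE = [[entry / DET for entry in row] for row in ADJ]   # built only once

def solve_outputs(d):
    """Return x* = (I - A)^{-1} d for one final-demand vector d."""
    return [sum(inv_ij * d_j for inv_ij, d_j in zip(row, d)) for row in INVERSE]

# Alternative targets in billions of dollars; the last two are hypothetical.
targets = [(10, 5, 6), (20, 10, 12), (10, 5, 0)]
for d in targets:
    x = solve_outputs(d)
    print(d, [round(float(v), 2) for v in x])
```

Each additional target costs one matrix-vector multiplication, whereas Cramer's rule would need fresh numerator determinants for every new \(d\).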


The Existence of Nonnegative Solutions 

In the previous numerical example, the Leontief matrix \(I - A\) happens to be nonsingular,
so solution values of output variables \(x_j\) do exist. Moreover, the solution values \(x_j^*\) all turn
out to be nonnegative, as economic sense would dictate. Such desired results, however,
cannot be expected to emerge automatically; they come about only when the Leontief
matrix possesses certain properties. These properties are described in the so-called
Hawkins-Simon condition.

To explain this condition, we need to introduce the mathematical concept of principal
minors of a matrix, because the algebraic signs of principal minors will provide important
clues in guiding our analytical conclusions. We already know that, given a square matrix,
say, \(B\), with determinant \(|B|\), a minor is a subdeterminant obtained by deleting the \(i\)th row
and \(j\)th column of \(|B|\), where \(i\) and \(j\) are not necessarily equal. If we now impose the
restriction that \(i = j\), then the resulting minor is known as a principal minor. For example,
given a \(3 \times 3\) matrix \(B\), we can write its determinant generally as

\[
|B| = \begin{vmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{vmatrix}
\tag{5.25}
\]


† David Hawkins and Herbert A. Simon, “Note: Some Conditions of Macroeconomic Stability,”
Econometrica, July-October, 1949, pp. 245-48.




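The simultaneous deletion of the \(i\)th row and the \(i\)th column (\(i = 3, 2, 1\), successively)
results in the following three \(2 \times 2\) principal minors:

\[
\begin{vmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{vmatrix} \qquad
\begin{vmatrix} b_{11} & b_{13} \\ b_{31} & b_{33} \end{vmatrix} \qquad
\begin{vmatrix} b_{22} & b_{23} \\ b_{32} & b_{33} \end{vmatrix}
\tag{5.26}
\]

In view of their \(2 \times 2\) dimensions, these are referred to as second-order principal minors.
We can also generate first-order principal minors (\(1 \times 1\)) by deleting any two rows and the
same-numbered columns from \(|B|\). They are

\[ |b_{11}| = b_{11} \qquad |b_{22}| = b_{22} \qquad |b_{33}| = b_{33} \tag{5.27} \]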

Finally, to complete the picture, we can consider \(|B|\) itself as the third-order principal
minor of \(|B|\). Note that in all the minors listed in (5.25) through (5.27), their
principal-diagonal elements consist exclusively of the principal-diagonal elements of \(B\).
Herein lies the rationale for the name “principal minors.”

While certain economic applications require checking the algebraic signs of all the
principal minors of a matrix \(B\), quite often our conclusion depends only on the sign pattern of
a particular subset of the principal minors referred to variously as the leading principal
minors, naturally ordered principal minors, or successive principal minors. In the \(3 \times 3\) case,
this subset consists only of the first members of (5.25) through (5.27):

\[
|B_1| \equiv |b_{11}| \qquad
|B_2| \equiv \begin{vmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{vmatrix} \qquad
|B_3| \equiv \begin{vmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{vmatrix}
\tag{5.28}
\]

Here, the single subscript \(m\) in the symbol \(|B_m|\), unlike in the subscript usage in the context
of Cramer's rule, is employed to indicate that the leading principal minor is of dimension
\(m \times m\). An easy way to derive the leading principal minors is to section off the determinant
\(|B|\) with the successive broken lines as shown:

\[
\left|\begin{array}{ccc}
\multicolumn{1}{c|}{b_{11}} & \multicolumn{1}{c|}{b_{12}} & b_{13} \\ \cline{1-1}
b_{21} & \multicolumn{1}{c|}{b_{22}} & b_{23} \\ \cline{1-2}
b_{31} & b_{32} & b_{33}
\end{array}\right|
\tag{5.29}
\]

Taking the top element in the principal diagonal of \(|B|\) by itself alone gives us \(|B_1|\); taking
the first two elements in the principal diagonal, \(b_{11}\) and \(b_{22}\), along with their accompanying
off-diagonal elements yields \(|B_2|\); and so forth.


† An alternative definition of principal minors would allow for the various permutations of the
subscript indices \(i\), \(j\), and \(k\). This would mean, in the input-output context, the renumbering of the
industries (e.g., the first industry becomes the second industry, and vice versa, so that the subscript
11 becomes 22, and the subscript 22 becomes 11, and so on). As a result, in addition to the \(2 \times 2\)
principal minors in (5.26), we would also have

\[
\begin{vmatrix} b_{22} & b_{21} \\ b_{12} & b_{11} \end{vmatrix} \qquad
\begin{vmatrix} b_{33} & b_{31} \\ b_{13} & b_{11} \end{vmatrix} \qquad \text{and} \qquad
\begin{vmatrix} b_{33} & b_{32} \\ b_{23} & b_{22} \end{vmatrix}
\]

But these last three, in the order given, exactly match the three listed in (5.26) in value and algebraic
sign; thus they can be omitted from consideration for our purposes. Similarly, even though the
permutation of subscript indices can generate additional \(3 \times 3\) principal minors, they merely
duplicate the one in (5.25) in value and sign, and thus can also be disregarded.



Given a higher-dimension determinant, say, \(n \times n\), there will of course be a larger number
of principal minors, but the pattern of their construction is the same. A \(k\)th-order
principal minor is always obtained by deleting any \(n - k\) rows and the same-numbered columns
from \(|B|\). And its leading principal minors \(|B_m|\) (with \(m = 1, 2, \ldots, n\)) are always formed
by taking the first \(m\) principal-diagonal elements in \(|B|\) along with their accompanying
off-diagonal elements.
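The construction just described can be sketched in Python. This is an illustrative sketch only: the function names are assumptions, the determinant uses a simple first-row expansion that is adequate for small matrices, and rows and columns are indexed from 0 as is usual in Python (so the key `(0, 2)` means rows and columns 1 and 3 are kept). The matrix used is the Leontief matrix of (5.24).

```python
from fractions import Fraction
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        # Delete row 0 and column j to form the minor of this entry.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def principal_minors(b, k):
    """All kth-order principal minors: keep the same k rows and columns."""
    n = len(b)
    return {keep: det([[b[i][j] for j in keep] for i in keep])
            for keep in combinations(range(n), k)}

def leading_principal_minors(b):
    """|B_1|, ..., |B_n|: determinants of the top-left m x m blocks."""
    return [det([row[:m] for row in b[:m]]) for m in range(1, len(b) + 1)]

# Leontief matrix of (5.24), entered as decimal strings to stay exact.
B = [[Fraction(s) for s in row] for row in (("0.8", "-0.3", "-0.2"),
                                            ("-0.4", "0.9", "-0.2"),
                                            ("-0.1", "-0.3", "0.8"))]

print({k: float(v) for k, v in principal_minors(B, 2).items()})
print([float(v) for v in leading_principal_minors(B)])   # [0.8, 0.6, 0.384]
```

Keeping \(k\) index positions is the same as deleting the other \(n - k\) rows and same-numbered columns, so there are as many \(k\)th-order principal minors as there are ways of choosing \(k\) indices out of \(n\).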

With this background, we are ready to state the following important theorem due to 
Hawkins and Simon: 


Given (a) an \(n \times n\) matrix \(B\), with \(b_{ij} \le 0\) (\(i \ne j\)) (i.e., with all off-diagonal elements
nonpositive), and (b) an \(n \times 1\) vector \(d \ge 0\) (all elements nonnegative), there exists an \(n \times 1\)
vector \(x^* \ge 0\) such that \(Bx^* = d\), if and only if

\[ |B_m| > 0 \qquad (m = 1, 2, \ldots, n) \]

i.e., if and only if the leading principal minors of \(B\) are all positive.


The relevance of this theorem to input-output analysis becomes clear when we let \(B\) represent
the Leontief matrix \(I - A\) (where \(b_{ij} = -a_{ij}\) for \(i \ne j\) are indeed all nonpositive), and
\(d\), the final-demand vector (where all the elements are indeed nonnegative). Then \(Bx^* = d\)
is equivalent to \((I - A)x^* = d\), and the existence of \(x^* \ge 0\) guarantees nonnegative solution
output levels. The necessary-and-sufficient condition for this, known as the Hawkins-Simon
condition, is that all the leading principal minors of the Leontief matrix \(I - A\) be positive.
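As a quick check of the condition, the leading principal minors of the Leontief matrix in (5.24) can be evaluated directly. The computation below is a sketch based only on the entries of (5.24) and the determinant 0.384 that appears in the solution formula.

```latex
\[
|B_1| = 0.8 > 0, \qquad
|B_2| = \begin{vmatrix} 0.8 & -0.3 \\ -0.4 & 0.9 \end{vmatrix} = 0.72 - 0.12 = 0.60 > 0, \qquad
|B_3| = |I - A| = 0.384 > 0
\]
```

All three are positive, which is consistent with the nonnegative output levels found in the numerical example.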

The proof of this theorem is too lengthy to be presented here,* but it should be worthwhile
to explore its economic meaning, which is relatively easy to see in the simple two-industry
case (\(n = 2\)).


Economic Meaning of the Hawkins-Simon Condition 
For the two-industry case, the Leontief matrix is

\[ I - A = \begin{bmatrix} 1 - a_{11} & -a_{12} \\ -a_{21} & 1 - a_{22} \end{bmatrix} \]

The first part of the Hawkins-Simon condition, \(|B_1| > 0\), requires that

\[ 1 - a_{11} > 0 \qquad \text{or} \qquad a_{11} < 1 \]

Economically, this requires the amount of the first commodity used in the production of a
dollar's worth of the first commodity to be less than one dollar. The other part of the
condition, \(|B_2| > 0\), requires that

\[ (1 - a_{11})(1 - a_{22}) - a_{12}a_{21} > 0 \]


* A thorough discussion can be found in Akira Takayama, Mathematical Economics, 2d ed., Cambridge
University Press, 1985, pp. 380-385.

Some writers use an alternative version of the Hawkins-Simon condition, which requires all the
principal minors of \(B\) (not only the leading ones) to be positive. As Takayama shows, however, in
the present case, with the special restriction on \(|B|\), it happens that requiring the positivity of the
leading principal minors (a less stringent condition) can achieve the same result. Nevertheless, it
should be emphasized that, as a general rule, the fact that the leading principal minors satisfy a
particular sign requirement does not guarantee that all the principal minors automatically satisfy that
requirement, too. Hence, a condition stated in terms of all the principal minors must be checked
against all the principal minors, not only the leading ones.




or, equivalently,

\[ a_{11} + a_{12}a_{21} + (1 - a_{11})a_{22} < 1 \]

Further, since \((1 - a_{11})a_{22}\) is nonnegative (given \(a_{11} < 1\) and \(a_{22} \ge 0\)), the previous
inequality implies that

\[ a_{11} + a_{12}a_{21} < 1 \]

Economically, \(a_{11}\) measures the direct use of the first commodity as input in the production
of the first commodity itself, and \(a_{12}a_{21}\) measures the indirect use: it gives the amount of
the first commodity needed in producing the specific quantity of the second commodity
that goes into the production of a dollar's worth of the first commodity. Thus the last
inequality mandates that the amount of the first commodity used as direct and indirect inputs
in producing a dollar's worth of the commodity itself must be less than one dollar. Thus,
what the Hawkins-Simon condition does is to specify certain practicability and viability
restrictions for the production process. If and only if the production process is economically
practicable and viable, can it yield meaningful, nonnegative solution output levels.
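The algebra linking the second part of the condition to its rearranged form can be spelled out in one intermediate step. The lines below are a sketch that uses nothing beyond expanding the product \((1 - a_{11})(1 - a_{22})\).

```latex
\[
\begin{aligned}
(1 - a_{11})(1 - a_{22}) - a_{12}a_{21} &> 0 \\
(1 - a_{11}) - (1 - a_{11})a_{22} - a_{12}a_{21} &> 0 \\
a_{11} + a_{12}a_{21} + (1 - a_{11})a_{22} &< 1
\end{aligned}
\]
```

Dropping the term \((1 - a_{11})a_{22}\), which cannot be negative when \(a_{11} < 1\) and \(a_{22} \ge 0\), can only lower the left side, so \(a_{11} + a_{12}a_{21} < 1\) follows.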





The Closed Model 

If the exogenous sector of the open input-output model is absorbed into the system as just
another industry, the model will become a closed model. In such a model, final demand and
primary input do not appear; in their place will be the input requirements and the output of
the newly conceived industry. All goods will now be intermediate in nature, because
everything that is produced is produced only for the sake of satisfying the input requirements
of the \((n + 1)\) industries in the model.

At first glance, the conversion of the open sector into an additional industry would
not seem to create any significant change in the analysis. Actually, however, since the new
industry is assumed to have a fixed input ratio as does any other industry, the supply of what
used to be the primary input must now bear a fixed proportion to what used to be called the
final demand. More concretely, this may mean, for example, that households will consume
each commodity in a fixed proportion to the labor service they supply. This certainly
constitutes a significant change in the analytical framework involved.

Mathematically, the disappearance of the final demands means that we will now have a
homogeneous-equation system. Assuming four industries only (including the new one,
designated by the subscript 0), the “correct” output levels will, by analogy to (5.20'), be those
which satisfy the equation system:

\[
\begin{bmatrix}
(1 - a_{00}) & -a_{01} & -a_{02} & -a_{03} \\
-a_{10} & (1 - a_{11}) & -a_{12} & -a_{13} \\
-a_{20} & -a_{21} & (1 - a_{22}) & -a_{23} \\
-a_{30} & -a_{31} & -a_{32} & (1 - a_{33})
\end{bmatrix}
\begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
\]

Because this equation system is homogeneous, it can have a nontrivial solution if and only
if the \(4 \times 4\) Leontief matrix \(I - A\) has a vanishing determinant. The latter condition is
indeed always satisfied: In a closed model, no primary input exists; hence each column sum
in the input-coefficient matrix \(A\) must now be exactly equal to (rather than less than) 1; that
is, \(a_{0j} + a_{1j} + a_{2j} + a_{3j} = 1\), or

\[ a_{0j} = 1 - a_{1j} - a_{2j} - a_{3j} \]




But this implies that, in every column of the matrix \(I - A\), given previously, the top
element is always equal to the negative of the sum of the other three elements. Consequently,
the four rows are linearly dependent, and we must find \(|I - A| = 0\). This guarantees that
the system does possess nontrivial solutions; in fact, as indicated in Table 5.1, it has an
infinite number of them. This means that in a closed model, with a homogeneous-equation
system, no unique “correct” output mix exists. We can determine the output levels
\(x_0^*, \ldots, x_3^*\) in proportion to one another, but cannot fix their absolute levels unless
additional restrictions are imposed on the model.
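The idea that a closed model fixes only relative output levels can be illustrated with a small sketch. The three-industry input matrix below is hypothetical (it is not from the text) and is chosen only so that every column sums to 1; the function name `relative_outputs` is likewise an illustrative assumption. The sketch normalizes the last output to 1 and solves the remaining equations exactly, assuming the reduced system is nonsingular.

```python
from fractions import Fraction

# Hypothetical closed-model input matrix (not from the text): each column sums to 1.
A = [[Fraction(s) for s in row] for row in (("0.2", "0.3", "0.5"),
                                            ("0.4", "0.1", "0.3"),
                                            ("0.4", "0.6", "0.2"))]
n = len(A)
assert all(sum(A[i][j] for i in range(n)) == 1 for j in range(n))

# Leontief matrix I - A; every column sums to zero, so the rows are linearly dependent.
M = [[Fraction(int(i == j)) - A[i][j] for j in range(n)] for i in range(n)]
assert all(sum(M[i][j] for i in range(n)) == 0 for j in range(n))

def relative_outputs(m):
    """Solve m x = 0 with the last output normalized to 1.

    Only the first n-1 equations are used (the last one is redundant);
    they are solved by Gauss-Jordan elimination.
    """
    size = len(m) - 1
    # Move the last column to the right-hand side of each retained equation.
    aug = [row[:size] + [-row[size]] for row in m[:size]]
    for col in range(size):
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[size] for row in aug] + [Fraction(1)]

x = relative_outputs(M)
print(x)   # output levels relative to the last industry
```

Any positive multiple of the resulting vector satisfies the system equally well, which is the sense in which the absolute levels remain undetermined.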





EXERCISE 5.7 

1. On the basis of the model in (5.24), if the final demands are \(d_1 = 30\), \(d_2 = 15\), and
\(d_3 = 10\) (all in billions of dollars), what are the solution output levels for the three
industries? (Round off answers to two decimal places.)

2. Using the information in (5.23), calculate the total amount of primary input required
to produce the solution output levels of Prob. 1.

3. In a two-industry economy, it is known that industry I uses 10 cents of its own product
and 60 cents of commodity II to produce a dollar's worth of commodity I; industry II
uses none of its own product but uses 50 cents of commodity I in producing a dollar's
worth of commodity II; and the open sector demands $1,000 billion of commodity I
and $2,000 billion of commodity II.

(a) Write out the input matrix, the Leontief matrix, and the specific input-output
matrix equation for this economy.
(b) Check whether the data in this problem satisfy the Hawkins-Simon condition.
(c) Find the solution output levels by Cramer's rule.
4. Given the input matrix and the final-demand vector

\[
A = \begin{bmatrix} \text{[illegible]} & 0.25 & \text{[illegible]} \\ 0.33 & 0.10 & 0.12 \\ 0.19 & 0.38 & 0 \end{bmatrix}
\qquad
d = \begin{bmatrix} \text{[illegible]} \\ 200 \\ 900 \end{bmatrix}
\]

(a) Explain the economic meaning of the elements 0.33, 0, and 200.
(b) Explain the economic meaning (if any) of the third-column sum.
(c) Explain the economic meaning (if any) of the third-row sum.
(d) Write out the specific input-output matrix equation for this model.
(e) Check whether the data given in this problem satisfy the Hawkins-Simon condition.
5. (a) Given a \(4 \times 4\) matrix \(B = [b_{ij}]\), write out all the principal minors.
(b) Write out all the leading principal minors.

6. Show that, by itself (without other restrictions on matrix \(B\)), the Hawkins-Simon
condition already guarantees the existence of a unique solution vector \(x^*\), though not
necessarily nonnegative.


5.8 Limitations of Static Analysis 





In the discussion of static equilibrium in the market or in the national income, our primary 
concern has been to find the equilibrium values of the endogenous variables in the model. A 
fundamental point that was ignored in such an analysis is the actual process of adjustments 




and readjustments of the variables ultimately leading to the equilibrium state (if it is at all 
attainable). We asked only about where we shall arrive but did not question when or what 
may happen along the way. 

The static type of analysis fails, therefore, to take into account two problems of
importance. One is that, since the adjustment process may take a long time to complete, an
equilibrium state as determined within a particular frame of static analysis may have lost its
relevance before it is even attained, if the exogenous forces in the model have undergone
some changes in the meantime. This is the problem of shifts of the equilibrium state. The
second is that, even if the adjustment process is allowed to run its course undisturbed, the
equilibrium state envisaged in a static analysis may be altogether unattainable. This would
be the case of a so-called unstable equilibrium, which is characterized by the fact that the
adjustment process will drive the variables further away from, rather than progressively
closer to, that equilibrium state. To disregard the adjustment process, therefore, is to
assume away the problem of attainability of equilibrium.

The shifts of the equilibrium state (in response to exogenous changes) pertain to a type
of analysis called comparative statics, and the question of attainability and stability of
equilibrium falls within the realm of dynamic analysis. Each of these clearly serves to fill a
significant gap in the static analysis, and it is thus imperative to inquire into those areas of
analysis also. We shall leave the study of dynamic analysis to Part 5 of the book and shall
next turn our attention to the problem of comparative statics.





Part Three

Comparative-Static Analysis


Chapter 6

Comparative Statics and the Concept of Derivative


This chapter and Chaps. 7 and 8 will be devoted to the methods of comparative-static
analysis.


6.1 The Nature of Comparative Statics



Comparative statics, as the name suggests, is concerned with the comparison of different
equilibrium states that are associated with different sets of values of parameters and
exogenous variables. For purposes of such a comparison, we always start by assuming a given
initial equilibrium state. In the isolated-market model, for example, such an initial
equilibrium will be represented by a determinate price \(P^*\) and a corresponding quantity \(Q^*\).
Similarly, in the simple national-income model of (3.23), the initial equilibrium will be
specified by a determinate \(Y^*\) and a corresponding \(C^*\). Now if we let a disequilibrating
change occur in the model (in the form of a change in the value of some parameter or
exogenous variable), the initial equilibrium will, of course, be upset. As a result, the various
endogenous variables must undergo certain adjustments. If it is assumed that a new
equilibrium state relevant to the new values of the data can be defined and attained, the
question posed in the comparative-static analysis is: How would the new equilibrium
compare with the old?

It should be noted that in comparative statics we still disregard the process of adjustment
of the variables; we merely compare the initial (prechange) equilibrium state with the final
(postchange) equilibrium state. Also, we still preclude the possibility of instability of
equilibrium, for we assume the new equilibrium to be attainable, just as we do for the old.

A comparative-static analysis can be either qualitative or quantitative in nature. If we are
interested only in the question of, say, whether an increase in investment \(I_0\) will increase or
decrease the equilibrium income \(Y^*\), the analysis will be qualitative because the direction
of change is the only matter considered. But if we are concerned with the magnitude of the
change in \(Y^*\) resulting from a given change in \(I_0\) (that is, the size of the investment
multiplier), the analysis will obviously be quantitative. By obtaining a quantitative answer,
however, we can automatically tell the direction of change from its algebraic sign. Hence the
quantitative analysis always embraces the qualitative.





Chapter 6 Comparative Statics and the Concept of Derivative 125


It should be clear that the problem under consideration is essentially one of finding a
rate of change: the rate of change of the equilibrium value of an endogenous variable with
respect to the change in a particular parameter or exogenous variable. For this reason, the
mathematical concept of derivative takes on preponderant significance in comparative
statics, because that concept (the most fundamental one in the branch of mathematics
known as differential calculus) is directly concerned with the notion of rate of change.
Later on, moreover, we shall find the concept of derivative to be of extreme importance for
optimization problems as well.


6.2 Rate of Change and the Derivative


Even though our present context is concerned only with the rates of change of the
equilibrium values of the variables in a model, we may carry on the discussion in a more general
manner by considering the rate of change of any variable \(y\) in response to a change in
another variable \(x\), where the two variables are related to each other by the function

\[ y = f(x) \]

Applied to the comparative-static context, the variable \(y\) will represent the equilibrium
value of an endogenous variable, and \(x\) will be some parameter. Note that, for a start, we are
restricting ourselves to the simple case where there is only a single parameter or exogenous
variable in the model. Once we have mastered this simplified case, however, the extension
to the case of more parameters will prove relatively easy.


The Difference Quotient
Since the notion of “change” figures prominently in the present context, a special symbol
is needed to represent it. When the variable \(x\) changes from the value \(x_0\) to a new value \(x_1\),
the change is measured by the difference \(x_1 - x_0\). Hence, using the symbol \(\Delta\) (the Greek
capital delta, for “difference”) to denote the change, we write \(\Delta x = x_1 - x_0\). Also needed
is a way of denoting the value of the function \(f(x)\) at various values of \(x\). The standard
practice is to use the notation \(f(x_i)\) to represent the value of \(f(x)\) when \(x = x_i\). Thus,
for the function \(f(x) = 5 + x^2\), we have \(f(0) = 5 + 0^2 = 5\); and similarly,
\(f(2) = 5 + 2^2 = 9\), etc.

When \(x\) changes from an initial value \(x_0\) to a new value \((x_0 + \Delta x)\), the value of the
function \(y = f(x)\) changes from \(f(x_0)\) to \(f(x_0 + \Delta x)\). The change in \(y\) per unit of change in \(x\)
can be represented by the difference quotient.

\[ \frac{\Delta y}{\Delta x} = \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} \tag{6.1} \]

This quotient, which measures the average rate of change of \(y\), can be calculated if we know
the initial value of \(x\), or \(x_0\), and the magnitude of change in \(x\), or \(\Delta x\). That is, \(\Delta y/\Delta x\) is a
function of \(x_0\) and \(\Delta x\).


Example 1

Given \(y = f(x) = 3x^2 - 4\), we can write

\[ f(x_0) = 3(x_0)^2 - 4 \qquad f(x_0 + \Delta x) = 3(x_0 + \Delta x)^2 - 4 \]


126 Part Three Comparative-Static Analysis


Therefore, the difference quotient is

\[
\frac{\Delta y}{\Delta x}
= \frac{3(x_0 + \Delta x)^2 - 4 - (3x_0^2 - 4)}{\Delta x}
= \frac{6x_0\,\Delta x + 3(\Delta x)^2}{\Delta x}
= 6x_0 + 3\,\Delta x
\tag{6.2}
\]

which can be evaluated if we are given \(x_0\) and \(\Delta x\). Let \(x_0 = 3\) and \(\Delta x = 4\); then the
average rate of change of \(y\) is \(6(3) + 3(4) = 30\). This means that, on the average, as \(x\) changes
from 3 to 7, the change in \(y\) is 30 units per unit change in \(x\).
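The difference quotient of (6.1) translates directly into a short function. The following is a minimal sketch; the function name `difference_quotient` is an illustrative choice, and `Fraction` is used only so that the division is exact.

```python
from fractions import Fraction

def difference_quotient(f, x0, dx):
    """Average rate of change of f between x0 and x0 + dx, as in (6.1)."""
    if dx == 0:
        raise ValueError("dx must be nonzero")
    return (f(x0 + dx) - f(x0)) / Fraction(dx)

def f(x):
    """The function of Example 1: y = 3x^2 - 4."""
    return 3 * x ** 2 - 4

print(difference_quotient(f, 3, 4))   # 30
```

With \(x_0 = 3\) and \(\Delta x = 4\), the function evaluates \((143 - 23)/4\), which agrees with the value \(6(3) + 3(4) = 30\) obtained from (6.2).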


The Derivative 
Frequently, we are interested in the rate of change of \(y\) when \(\Delta x\) is very small. In such a
case, it is possible to obtain an approximation of \(\Delta y/\Delta x\) by dropping all the terms in the
difference quotient involving the expression \(\Delta x\). In (6.2), for instance, if \(\Delta x\) is very small,
we may simply take the term \(6x_0\) on the right as an approximation of \(\Delta y/\Delta x\). The smaller
the value of \(\Delta x\), of course, the closer is the approximation to the true value of \(\Delta y/\Delta x\).
As \(\Delta x\) approaches zero (meaning that it gets closer and closer to, but never actually
reaches, zero), \((6x_0 + 3\,\Delta x)\) will approach the value \(6x_0\), and by the same token, \(\Delta y/\Delta x\)
will approach \(6x_0\) also. Symbolically, this fact is expressed either by the statement
\(\Delta y/\Delta x \to 6x_0\) as \(\Delta x \to 0\), or by the equation

\[ \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} (6x_0 + 3\,\Delta x) = 6x_0 \tag{6.3} \]

where the symbol \(\lim_{\Delta x \to 0}\) is read as “The limit of ... as \(\Delta x\) approaches 0.” If, as \(\Delta x \to 0\),
the limit of the difference quotient \(\Delta y/\Delta x\) indeed exists, that limit is called the derivative
of the function \(y = f(x)\).
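The approach of the difference quotient to \(6x_0\) can be observed numerically. The sketch below evaluates the quotient for the function \(y = 3x^2 - 4\) at \(x_0 = 3\) with successively smaller increments; the step sizes are arbitrary illustrative choices, and a table of this kind suggests the limit but does not prove it.

```python
from fractions import Fraction

def f(x):
    """The function of (6.2): y = 3x^2 - 4."""
    return 3 * x ** 2 - 4

x0 = 3
steps = [Fraction(1, 10 ** k) for k in range(1, 6)]        # 1/10, 1/100, ..., 1/100000
quotients = [(f(x0 + dx) - f(x0)) / dx for dx in steps]    # each equals 6*x0 + 3*dx

for dx, q in zip(steps, quotients):
    print(f"dx = {dx}  quotient = {float(q)}")
```

Each quotient exceeds \(6x_0 = 18\) by exactly \(3\,\Delta x\), so the gap shrinks in step with \(\Delta x\) itself.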

Several points should be noted about the derivative if it exists. First, a derivative is a
function; in fact, in this usage the word derivative really means a derived function. The
original function \(y = f(x)\) is a primitive function, and the derivative is another function
derived from it. Whereas the difference quotient is a function of \(x_0\) and \(\Delta x\), you should
observe (from (6.3), for instance) that the derivative is a function of \(x_0\) only. This is
because \(\Delta x\) is already compelled to approach zero, and therefore it should not be regarded
as another variable in the function. Let us also add that so far we have used the subscripted
symbol \(x_0\) only in order to stress the fact that a change in \(x\) must start from some specific
value of \(x\). Now that this is understood, we may delete the subscript and simply state that
the derivative, like the primitive function, is itself a function of the independent variable \(x\).
That is, for each value of \(x\), there is a unique corresponding value for the derivative
function.

Second, since the derivative is merely a limit of the difference quotient, which measures
a rate of change of \(y\), the derivative must of necessity also be a measure of some rate of
change. In view of the fact that the change in \(x\) envisaged in the derivative concept is
infinitesimal (that is, \(\Delta x \to 0\)), the rate measured by the derivative is in the nature of an
instantaneous rate of change.

Third, there is the matter of notation. Derivative functions are commonly denoted in two
ways. Given a primitive function \(y = f(x)\), one way of denoting its derivative (if it exists)
is to use the symbol \(f'(x)\), or simply \(f'\); this notation is attributed to the mathematician
Lagrange. The other common notation is \(dy/dx\), devised by the mathematician Leibniz.
[Actually there is a third notation, \(Dy\), or \(Df(x)\), but we shall not use it in the following
discussion.] The notation \(f'(x)\), which resembles the notation for the primitive function
\(f(x)\), has the advantage of conveying the idea that the derivative is itself a function of \(x\).
The reason for expressing it as \(f'(x)\) (rather than, say, \(\phi(x)\)) is to emphasize that the
function \(f'\) is derived from the primitive function \(f\). The alternative notation, \(dy/dx\), serves
instead to emphasize that the value of a derivative measures a rate of change. The letter \(d\) is
the counterpart of the Greek \(\Delta\), and \(dy/dx\) differs from \(\Delta y/\Delta x\) chiefly in that the former is
the limit of the latter as \(\Delta x\) approaches zero. In the subsequent discussion, we shall use
both of these notations, depending on which seems the more convenient in a particular
context.

Using these two notations, we may define the derivative of a given function \(y = f(x)\) as
follows:

\[ \frac{dy}{dx} \equiv f'(x) \equiv \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} \]


Example 2

Referring to the function \(y = 3x^2 - 4\) again, we have shown its difference quotient to be
(6.2), and the limit of that quotient to be (6.3). On the basis of the latter, we may now write
(replacing \(x_0\) with \(x\)):

\[ \frac{dy}{dx} = 6x \qquad \text{or} \qquad f'(x) = 6x \]

Note that different values of \(x\) will give the derivative correspondingly different values. For
instance, when \(x = 3\), we find, by substituting \(x = 3\) in the \(f'(x)\) expression, that
\(f'(3) = 6(3) = 18\); similarly, when \(x = 4\), we have \(f'(4) = 6(4) = 24\). Thus, whereas \(f'(x)\)
denotes a derivative function, the expressions \(f'(3)\) and \(f'(4)\) each represent a specific
derivative value.





EXERCISE 6.2 


1. Given the function \(y = 4x^2 + 9\):
(a) Find the difference quotient as a function of \(x\) and \(\Delta x\). (Use \(x\) in lieu of \(x_0\).)
(b) Find the derivative \(dy/dx\).
(c) Find \(f'(3)\) and \(f'(4)\).
2. Given the function \(y = 5x^2 - 4x\):
(a) Find the difference quotient as a function of \(x\) and \(\Delta x\).
(b) Find the derivative \(dy/dx\).
(c) Find \(f'(2)\) and \(f'(3)\).
3. Given the function \(y = 5x - 2\):
(a) Find the difference quotient \(\Delta y/\Delta x\). What type of function is it?
(b) Since the expression \(\Delta x\) does not appear in the function \(\Delta y/\Delta x\) in part (a), does it
make any difference to the value of \(\Delta y/\Delta x\) whether \(\Delta x\) is large or small?
Consequently, what is the limit of the difference quotient as \(\Delta x\) approaches zero?




6.3 The Derivative and the Slope of a Curve





FIGURE 6.1 





Elementary economics tells us that, given a total-cost function \(C = f(Q)\), where \(C\)
denotes total cost and \(Q\) the output, the marginal cost (MC) is defined as the change in total
cost resulting from a unit increase in output; that is, \(MC = \Delta C/\Delta Q\). It is understood that
\(\Delta Q\) is an extremely small change. For the case of a product that has discrete units (integers
only), a change of one unit is the smallest change possible; but for the case of a product
whose quantity is a continuous variable, \(\Delta Q\) can refer to an infinitesimal change. In this
latter case, it is well known that the marginal cost can be measured by the slope of the
total-cost curve. But the slope of the total-cost curve is nothing but the limit of the ratio
\(\Delta C/\Delta Q\), when \(\Delta Q\) approaches zero. Thus the concept of the slope of a curve is merely
the geometric counterpart of the concept of the derivative. Both have to do with the
“marginal” notion so extensively used in economics.

In Fig. 6.1, we have drawn a total-cost curve \(C\), which is the graph of the (primitive)
function \(C = f(Q)\). Suppose that we consider \(Q_0\) as the initial output level from which an
increase in output is measured; then the relevant point on the cost curve is the point \(A\). If
output is to be raised to \(Q_0 + \Delta Q = Q_2\), the total cost will be increased from \(C_0\) to
\(C_0 + \Delta C = C_2\); thus \(\Delta C/\Delta Q = (C_2 - C_0)/(Q_2 - Q_0)\). Geometrically, this is the ratio
of two line segments, \(EB/AE\), or the slope of the line \(AB\). This particular ratio measures an
average rate of change (the average marginal cost for the particular \(\Delta Q\) pictured) and
represents a difference quotient. As such, it is a function of the initial value \(Q_0\) and the
amount of change \(\Delta Q\).

What happens when we vary the magnitude of \(\Delta Q\)? If a smaller output increment is
contemplated (say, from \(Q_0\) to \(Q_1\) only), then the average marginal cost will be measured
by the slope of the line \(AD\) instead. Moreover, as we reduce the output increment further
and further, flatter and flatter lines will result until, in the limit (as \(\Delta Q \to 0\)), we obtain the
line \(KG\) (which is the tangent line to the cost curve at point \(A\)) as the relevant line. The slope
of \(KG\) (\(= HG/KH\)) measures the slope of the total-cost curve at point \(A\) and represents
the limit of \(\Delta C/\Delta Q\), as \(\Delta Q \to 0\), when initial output is at \(Q = Q_0\). Therefore, in terms
of the derivative, the slope of the \(C = f(Q)\) curve at point \(A\) corresponds to the particular
derivative value \(f'(Q_0)\).

What if the initial output level is changed from \(Q_0\) to, say, \(Q_2\)? In that case, point \(B\) on
the curve will replace point \(A\) as the relevant point, and the slope of the curve at the new
point \(B\) will give us the derivative value \(f'(Q_2)\). Analogous results are obtainable for
alternative initial output levels. In general, the derivative \(f'(Q)\), a function of \(Q\), will vary as
\(Q\) changes.


6.4 The Concept of Limit 


The derivative \(dy/dx\) has been defined as the limit of the difference quotient \(\Delta y/\Delta x\) as
\(\Delta x \to 0\). If we adopt the shorthand symbols \(q \equiv \Delta y/\Delta x\) (\(q\) for quotient) and \(v \equiv \Delta x\)
(\(v\) for variation in the value of \(x\)), we have

\[ \frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{v \to 0} q \]

In view of the fact that the derivative concept relies heavily on the notion of limit, it is
imperative that we get a clear idea about that notion.


Left-Side Limit and Right-Side Limit 

The concept of limit is concerned with the question: "What value does one variable (say, $q$)
approach as another variable (say, $v$) approaches a specific value (say, zero)?" In order for
this question to make sense, $q$ must, of course, be a function of $v$; say, $q = g(v)$. Our
immediate interest is in finding the limit of $q$ as $v \to 0$, but we may just as easily explore
the more general case of $v \to N$, where $N$ is any finite real number. Then, $\lim_{v \to 0} q$ will be
merely a special case of $\lim_{v \to N} q$ where $N = 0$. In the course of the discussion, we shall
actually also consider the limit of $q$ as $v \to +\infty$ (plus infinity) or as $v \to -\infty$ (minus
infinity).

When we say $v \to N$, the variable $v$ can approach the number $N$ either from values
greater than $N$, or from values less than $N$. If, as $v \to N$ from the left side (from values less
than $N$), $q$ approaches a finite number $L$, we call $L$ the left-side limit of $q$. On the other hand,
if $L$ is the number that $q$ tends to as $v \to N$ from the right side (from values greater than $N$),
we call $L$ the right-side limit of $q$. The left- and right-side limits may or may not be equal.

The left-side limit of $q$ is symbolized by $\lim_{v \to N^-} q$ (the minus sign signifies from values
less than $N$), and the right-side limit is written as $\lim_{v \to N^+} q$. When, and only when, the two
limits have a common finite value (say, $L$), we consider the limit of $q$ to exist and write it as
$\lim_{v \to N} q = L$. Note that $L$ must be a finite number. If we have the situation of $\lim_{v \to N} q = \infty$
(or $-\infty$), we shall consider $q$ to possess no limit, because $\lim_{v \to N} q = \infty$ means that $q \to \infty$
as $v \to N$, and if $q$ will assume ever-increasing values as $v$ tends to $N$, it would be
contradictory to say that $q$ has a limit. As a convenient way of expressing the fact that $q \to \infty$ as
$v \to N$, however, some people do indeed write $\lim_{v \to N} q = \infty$ and speak of $q$ as having an
"infinite limit."


FIGURE 6.2 (panels a through d, each showing a graph of $q = g(v)$ near $v = N$)


In certain cases, only the limit of one side needs to be considered. In taking the limit of
$q$ as $v \to +\infty$, for instance, only the left-side limit of $q$ is relevant, because $v$ can approach
$+\infty$ only from the left. Similarly, for the case of $v \to -\infty$, only the right-side limit is
relevant. Whether the limit of $q$ exists in these cases will depend only on whether $q$
approaches a finite value as $v \to +\infty$, or as $v \to -\infty$.

It is important to realize that the symbol $\infty$ (infinity) is not a number, and therefore it
cannot be subjected to the usual algebraic operations. We cannot have $3 + \infty$ or $1/\infty$, nor
can we write $q = \infty$, which is not the same as $q \to \infty$. However, it is acceptable to express
the limit of $q$ as "$=$" (as against $\to$) $\infty$, for this merely indicates that $q \to \infty$.


Graphical Illustrations 
Let us illustrate, in Fig. 6.2, several possible situations regarding the limit of a function
$q = g(v)$.

Figure 6.2a shows a smooth curve. As the variable $v$ tends to the value $N$ from either
side on the horizontal axis, the variable $q$ tends to the value $L$. In this case, the left-side limit
is identical with the right-side limit; therefore we can write $\lim_{v \to N} q = L$.




The curve drawn in Fig. 6.2b is not smooth; it has a sharp turning point directly above
the point $N$. Nevertheless, as $v$ tends to $N$ from either side, $q$ again tends to an identical
value $L$. The limit of $q$ again exists and is equal to $L$.

Figure 6.2c shows what is known as a step function.† In this case, as $v$ tends to $N$, the
left-side limit of $q$ is $L_1$, but the right-side limit is $L_2$, a different number. Hence, $q$ does not
have a limit as $v \to N$.

Lastly, in Fig. 6.2d, as $v$ tends to $N$, the left-side limit of $q$ is $-\infty$, whereas the right-side
limit is $+\infty$, because the two parts of the (hyperbolic) curve will fall and rise indefinitely
while approaching the broken vertical line as an asymptote. Again, $\lim_{v \to N} q$ does not exist.





On the other hand, if we are considering a different sort of limit in diagram d, namely,
$\lim_{v \to +\infty} q$, then only the left-side limit has relevance, and we do find that limit to exist:
$\lim_{v \to +\infty} q = M$. Analogously, you can verify that $\lim_{v \to -\infty} q = M$ as well.

It is also possible to apply the concepts of left-side and right-side limits to the discussion
of the marginal cost in Fig. 6.1. In that context, the variables $q$ and $v$ will refer, respectively,
to the quotient $\Delta C/\Delta Q$ and to the magnitude of $\Delta Q$, with all changes being measured
from point $A$ on the curve. In other words, $q$ will refer to the slope of such lines as $AB$, $AD$,
and $KG$, whereas $v$ will refer to the length of such lines as $Q_0Q_2$ ($=$ line $AE$) and
$Q_0Q_1$ ($=$ line $AF$). We have already seen that, as $v$ approaches zero from a positive value,
$q$ will approach a value equal to the slope of line $KG$. Similarly, we can establish that, if
$\Delta Q$ approaches zero from a negative value (i.e., as the decrease in output becomes less and
less), the quotient $\Delta C/\Delta Q$, as measured by the slope of such lines as $RA$ (not drawn), will
also approach a value equal to the slope of line $KG$. Indeed, the situation here is very much
akin to that illustrated in Fig. 6.2a. Thus the slope of $KG$ in Fig. 6.1 (the counterpart of $L$ in
Fig. 6.2) is indeed the limit of the quotient $q$ as $v$ tends to zero, and as such it gives us the
marginal cost at the output level $Q = Q_0$.


Evaluation of a Limit 
Let us now illustrate the algebraic evaluation of a limit of a given function $q = g(v)$.


Example 1

Given $q = 2 + v^2$, find $\lim_{v \to 0} q$. To take the left-side limit, we substitute the series of negative
values $-1, -\frac{1}{10}, -\frac{1}{100}, \ldots$ (in that order) for $v$ and find that $(2 + v^2)$ will decrease steadily
and approach 2 (because $v^2$ will gradually approach 0). Next, for the right-side limit, we
substitute the series of positive values $1, \frac{1}{10}, \frac{1}{100}, \ldots$ (in that order) for $v$ and find the same
limit as before. Inasmuch as the two limits are identical, we consider the limit of $q$ to exist
and write $\lim_{v \to 0} q = 2$.
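The substitution procedure of Example 1 can be mimicked with exact fractions. This is a minimal sketch, not a proof: the function name `g` and the choice of four terms per series are assumptions made for illustration.

```python
from fractions import Fraction


def g(v):
    """The function of Example 1: q = 2 + v^2."""
    return 2 + v**2


# Series approaching 0 from the left (-1, -1/10, -1/100, -1/1000) and from the right.
left_inputs = [-Fraction(1, 10**k) for k in range(4)]
right_inputs = [Fraction(1, 10**k) for k in range(4)]

left_values = [g(v) for v in left_inputs]
right_values = [g(v) for v in right_inputs]

for v, q in zip(left_inputs, left_values):
    print(f"v = {str(v):>8}   q = {q}")
```

Both series produce the same values of $q$, which decrease steadily toward 2; the gap $q - 2$ is just $v^2$.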


† This name is easily explained by the shape of the curve. But step functions can be expressed
algebraically, too. The one illustrated in Fig. 6.2c can be expressed by the equation

$$q = \begin{cases} L_1 & (\text{for } 0 \le v < N) \\ L_2 & (\text{for } N \le v) \end{cases}$$

Note that, in each subset of its domain as described, the function appears as a distinct constant
function, which constitutes a "step" in the graph.

In economics, step functions can be used, for instance, to show the various prices charged for
different quantities purchased (the curve shown in Fig. 6.2c pictures quantity discount) or the various
tax rates applicable to different income brackets.




It is tempting to regard the answer obtained in Example 1 as the outcome of setting
$v = 0$ in the equation $q = 2 + v^2$, but this temptation should in general be resisted. In
evaluating $\lim_{v \to N} q$, we only let $v$ tend to $N$, but, as a rule, do not let $v = N$. Indeed, we can quite
legitimately speak of the limit of $q$ as $v \to N$, even if $N$ is not in the domain of the function
$q = g(v)$. In this latter case, if we try to set $v = N$, $q$ will clearly be undefined.


Example 2

Given $q = (1 - v^2)/(1 - v)$, find $\lim_{v \to 1} q$. Here, $N = 1$ is not in the domain of the function,
and we cannot set $v = 1$ because that would involve division by zero. Moreover, even the
limit-evaluation procedure of letting $v \to 1$, as used in Example 1, will cause difficulty, for
the denominator $(1 - v)$ will approach zero when $v \to 1$, and we will still have no way of
performing the division in the limit.

One way out of this difficulty is to try to transform the given ratio to a form in which $v$
will not appear in the denominator. Since $v \to 1$ implies that $v \neq 1$, so that $(1 - v)$ is
nonzero, it is legitimate to divide the expression $(1 - v^2)$ by $(1 - v)$, and write†

$$q = \frac{1 - v^2}{1 - v} = 1 + v \qquad (v \neq 1)$$

In this new expression for $q$, there is no longer a denominator with $v$ in it. Since $(1 + v) \to 2$
as $v \to 1$ from either side, we may then conclude that $\lim_{v \to 1} q = 2$.


Example 3

Given $q = (2v + 5)/(v + 1)$, find $\lim_{v \to +\infty} q$. The variable $v$ again appears in both the numerator
and the denominator. If we let $v \to +\infty$ in both, the result will be a ratio between two
infinitely large numbers, which does not have a clear meaning. To get out of the difficulty, we
try this time to transform the given ratio to a form in which the variable $v$ will not appear in
the numerator.‡ This, again, can be accomplished by dividing out the given ratio. Since
$(2v + 5)$ is not evenly divisible by $(v + 1)$, however, the result will contain a remainder term
as follows:

$$q = \frac{2v + 5}{v + 1} = 2 + \frac{3}{v + 1}$$

But, at any rate, this new expression for $q$ no longer has a numerator with $v$ in it. Noting
that the remainder $3/(v + 1) \to 0$ as $v \to +\infty$, we can then conclude that $\lim_{v \to +\infty} q = 2$.
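The two transformed expressions in Examples 2 and 3 can be checked with exact arithmetic. The sketch below is illustrative only; the function names `q_example2` and `q_example3` and the particular step sizes are assumptions, and a finite table of values does not replace the algebraic argument.

```python
from fractions import Fraction


def q_example2(v):
    """q = (1 - v^2)/(1 - v); undefined at v = 1 (division by zero)."""
    return (1 - v**2) / (1 - v)


def q_example3(v):
    """q = (2v + 5)/(v + 1)."""
    return (2 * v + 5) / (v + 1)


# Example 2: approach v = 1 from both sides; the gap to 2 should equal -h and +h.
steps = [Fraction(1, 10**k) for k in range(1, 5)]
left_gaps = [q_example2(1 - h) - 2 for h in steps]
right_gaps = [q_example2(1 + h) - 2 for h in steps]

# Example 3: let v grow large; the gap to 2 should equal the remainder 3/(v + 1).
large_v = [Fraction(10**k) for k in range(1, 5)]
tail_gaps = [q_example3(v) - 2 for v in large_v]

for v, gap in zip(large_v, tail_gaps):
    print(f"v = {v}: q - 2 = {gap}")
```

Using `Fraction` rather than floating-point numbers keeps every gap exact, so the match with $1 + v$ and with $2 + 3/(v + 1)$ is an identity at each sampled point rather than an approximation.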


There also exist several useful theorems on the evaluation of limits. These will be
discussed in Sec. 6.6.


† The division can be performed, as in the case of numbers, in the following manner:

            1 + v
    1 − v ) 1     − v²
            1 − v
            ──────────
                v − v²
                v − v²
                ──────
                     0

Alternatively, we may resort to factoring as follows:

$$\frac{1 - v^2}{1 - v} = \frac{(1 + v)(1 - v)}{1 - v} = 1 + v \qquad (v \neq 1)$$

‡ Note that, unlike the $v \to 0$ case, where we want to take $v$ out of the denominator in order to
avoid division by zero, the $v \to \infty$ case is better served by taking $v$ out of the numerator. As $v \to \infty$,
an expression containing $v$ in the numerator will become infinite but an expression with $v$ in the
denominator will, more conveniently for us, approach zero and quietly vanish from the scene.


Formal View of the Limit Concept 

The previous discussion should have conveyed some general ideas about the limit concept.
Let us now give it a more precise definition. Since such a definition will make use of the
concept of neighborhood of a point on a line (in particular, a specific number as a point on
the line of real numbers), we shall first explain the latter term.

For a given number $L$, there can always be found a number $(L - a_1) < L$ and another
number $(L + a_2) > L$, where $a_1$ and $a_2$ are some arbitrary positive numbers. The set of
all numbers falling between $(L - a_1)$ and $(L + a_2)$ is called the interval between those two
numbers. If the numbers $(L - a_1)$ and $(L + a_2)$ are included in the set, the set is a closed
interval; if they are excluded, the set is an open interval. A closed interval between
$(L - a_1)$ and $(L + a_2)$ is denoted by the bracketed expression

$$[L - a_1,\ L + a_2] \equiv \{q \mid L - a_1 \le q \le L + a_2\}$$

and the corresponding open interval is denoted with parentheses:

$$(L - a_1,\ L + a_2) \equiv \{q \mid L - a_1 < q < L + a_2\} \tag{6.4}$$


Thus, [ ] relate to the weak inequality sign $\le$, whereas ( ) relate to the strict inequality sign
$<$. But in both types of intervals, the smaller number $(L - a_1)$ is always listed first. Later
on, we shall also have occasion to refer to half-open and half-closed intervals such as $(3, 5]$
and $[6, \infty)$, which have the following meanings:

$$(3, 5] \equiv \{x \mid 3 < x \le 5\} \qquad [6, \infty) \equiv \{x \mid 6 \le x < \infty\}$$


Now we may define a neighborhood of $L$ to be an open interval as defined in (6.4),
which is an interval "covering" the number $L$.† Depending on the magnitudes of the
arbitrary numbers $a_1$ and $a_2$, it is possible to construct various neighborhoods for the given
number $L$. Using the concept of neighborhood, the limit of a function may then be defined
as follows:

As $v$ approaches a number $N$, the limit of $q = g(v)$ is the number $L$, if, for every
neighborhood of $L$ that can be chosen, however small, there can be found a corresponding
neighborhood of $N$ (excluding the point $v = N$) in the domain of the function such that, for
every value of $v$ in that $N$-neighborhood, its image lies in the chosen $L$-neighborhood.


This statement can be clarified with the help of Fig. 6.3, which resembles Fig. 6.2a.
From what was learned about Fig. 6.2a, we know that $\lim_{v \to N} q = L$ in Fig. 6.3. Let us show
that $L$ does indeed fulfill the new definition of a limit. As the first step, select an arbitrary
small neighborhood of $L$, say, $(L - a_1, L + a_2)$. (This should have been made even
smaller, but we are keeping it relatively large to facilitate exposition.) Now construct a
neighborhood of $N$, say, $(N - b_1, N + b_2)$, such that the two neighborhoods (when
extended into quadrant I) will together define a rectangle (shaded in diagram) with two of its
corners lying on the given curve. It can then be verified that, for every value of $v$ in this
neighborhood of $N$ (not counting $v = N$), the corresponding value of $q = g(v)$ lies in the
chosen neighborhood of $L$. In fact, no matter how small an $L$-neighborhood we choose, a
(correspondingly small) $N$-neighborhood can be found with the property just cited. Thus $L$
fulfills the definition of a limit, as was to be demonstrated.

FIGURE 6.3 (graph of $q = g(v)$)

† The identification of an open interval as the neighborhood of a point is valid only when we
are considering a point on a line (one-dimensional space). In the case of a point in a plane
(two-dimensional space), its neighborhood must be thought of as an area, say, a circular area
that includes the point.

We can also apply the given definition to the step function of Fig. 6.2c in order to show
that neither $L_1$ nor $L_2$ qualifies as $\lim_{v \to N} q$. If we choose a very small neighborhood of $L_1$
(say, just a hair's width on each side of $L_1$), then, no matter what neighborhood we pick for
$N$, the rectangle associated with the two neighborhoods cannot possibly enclose the lower
step of the function. Consequently, for any value of $v > N$, the corresponding value of $q$
(located on the lower step) will not be in the neighborhood of $L_1$, and thus $L_1$ fails the test
for a limit. By similar reasoning, $L_2$ must also be dismissed as a candidate for $\lim_{v \to N} q$. In
fact, in this case no limit exists for $q$ as $v \to N$.

The fulfillment of the definition can also be checked algebraically rather than by graph.
For instance, consider again the function

$$q = \frac{1 - v^2}{1 - v} = 1 + v \qquad (v \neq 1) \tag{6.5}$$

It has been found in Example 2 that $\lim_{v \to 1} q = 2$; thus, here we have $N = 1$ and $L = 2$. To
verify that $L = 2$ is indeed the limit of $q$, we must demonstrate that, for every chosen
neighborhood of $L$, $(2 - a_1, 2 + a_2)$, there exists a neighborhood of $N$, $(1 - b_1, 1 + b_2)$,
such that, whenever $v$ is in this neighborhood of $N$, $q$ must be in the chosen neighborhood
of $L$. This means essentially that, for given values of $a_1$ and $a_2$, however small, two
numbers $b_1$ and $b_2$ must be found such that, whenever the inequality

$$1 - b_1 < v < 1 + b_2 \qquad (v \neq 1) \tag{6.6}$$

is satisfied, another inequality of the form

$$2 - a_1 < q < 2 + a_2 \tag{6.7}$$

must also be satisfied. To find such a pair of numbers $b_1$ and $b_2$, let us first rewrite (6.7) by
substituting (6.5):

$$2 - a_1 < 1 + v < 2 + a_2 \tag{6.7$'$}$$

This, in turn, can be transformed (by subtracting 1 from each side) into the inequality

$$1 - a_1 < v < 1 + a_2 \tag{6.7$''$}$$

A comparison of (6.7$''$), a variant of (6.7), with (6.6) suggests that if we choose the two
numbers $b_1$ and $b_2$ to be $b_1 = a_1$ and $b_2 = a_2$, the two inequalities (6.6) and (6.7) will
always be satisfied simultaneously. Thus the neighborhood of $N$, $(1 - b_1, 1 + b_2)$, as
required in the definition of a limit, can indeed be found for the case of $L = 2$, and this
establishes $L = 2$ as the limit.

Let us now utilize the definition of a limit in the opposite way, to show that another value
(say, 3) cannot qualify as $\lim_{v \to 1} q$ for the function in (6.5). If 3 were that limit, it would have
to be true that, for every chosen neighborhood of 3, $(3 - a_1, 3 + a_2)$, there exists a
neighborhood of 1, $(1 - b_1, 1 + b_2)$, such that, whenever $v$ is in the latter neighborhood, $q$ must
be in the former neighborhood. That is, whenever the inequality

$$1 - b_1 < v < 1 + b_2$$

is satisfied, another inequality of the form

$$3 - a_1 < 1 + v < 3 + a_2$$

or

$$2 - a_1 < v < 2 + a_2$$

must also be satisfied. The only way to achieve this result is to choose $b_1 = a_1 - 1$ and
$b_2 = a_2 + 1$. This would imply that the neighborhood of 1 is to be the open interval
$(2 - a_1, 2 + a_2)$. According to the definition of a limit, however, $a_1$ and $a_2$ can be made
arbitrarily small, say, $a_1 = a_2 = 0.1$. In that case, the last-mentioned interval will turn out
to be $(1.9, 2.1)$, which lies entirely to the right of the point $v = 1$ on the horizontal axis and,
hence, does not even qualify as a neighborhood of 1. Thus the definition of a limit cannot
be satisfied by the number 3. A similar procedure can be employed to show that any
number other than 2 will contradict the definition of a limit in the present case.
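The neighborhood test for the function in (6.5) can be spot-checked by sampling. The following is a rough sketch under stated assumptions: the helper names `sample_points` and `neighborhood_test`, the sample size, and the particular values of $a_1$, $a_2$, $b_1$, $b_2$ are all chosen for illustration, and checking finitely many points is evidence rather than a proof.

```python
from fractions import Fraction


def q(v):
    """q = (1 - v^2)/(1 - v), defined only for v != 1, as in (6.5)."""
    return (1 - v**2) / (1 - v)


def sample_points(lo, hi, n=200):
    """n evenly spaced interior points of the open interval (lo, hi), skipping v = 1."""
    step = (hi - lo) / (n + 1)
    points = [lo + step * i for i in range(1, n + 1)]
    return [v for v in points if v != 1]


def neighborhood_test(L, a1, a2, b1, b2):
    """True if every sampled v in (1 - b1, 1 + b2), v != 1, has q(v) in (L - a1, L + a2)."""
    return all(L - a1 < q(v) < L + a2 for v in sample_points(1 - b1, 1 + b2))


a = Fraction(1, 10)
# Candidate L = 2 with the choice b1 = a1, b2 = a2 suggested by the text.
ok_for_2 = neighborhood_test(2, a, a, a, a)

# Candidate L = 3 with the same small L-neighborhood: even a tiny N-neighborhood fails,
# because q stays close to 2 whenever v is close to 1.
tiny = Fraction(1, 1000)
ok_for_3 = neighborhood_test(3, a, a, tiny, tiny)

print(ok_for_2, ok_for_3)
```

The sampled points pass the test for the candidate 2 and fail it for the candidate 3, in line with the algebraic argument.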

In general, if one number satisfies the definition of a limit of $q$ as $v \to N$, then no other
number can. If a limit exists, it is unique.





EXERCISE 6.4 


1. Given the function $q = (v^2 + v - 56)/(v - 7)$, $(v \neq 7)$, find the left-side limit and the
right-side limit of $q$ as $v$ approaches 7. Can we conclude from these answers that $q$ has
a limit as $v$ approaches 7?

2. Given $q = [(v + 2)^3 - 8]/v$, $(v \neq 0)$, find:


(@) lim g ( ling (©) lim 4 
3. Given $q = 5 - 1/v$, $(v \neq 0)$, find:
9 9 © 9 


4. Use Fig. 6.3 to show that we cannot consider the number $(L + a_2)$ as the limit of $q$ as $v$
tends to $N$.




6.5 Digression on Inequalities and Absolute Values 





We have encountered inequality signs many times before. In the discussion of Sec. 6.4, we
also applied mathematical operations to inequalities. In transforming (6.7$'$) into (6.7$''$), for
example, we subtracted 1 from each side of the inequality. What rules of operations are
generally applicable to inequalities (as opposed to equations)?


Rules of Inequalities 


To begin with, let us state an important property of inequalities: inequalities are transitive.
This means that, if $a > b$ and if $b > c$, then $a > c$. Since equalities (equations) are also
transitive, the transitivity property should apply to "weak" inequalities ($\ge$ or $\le$) as well as
to "strict" ones ($>$ or $<$). Thus we have

$$a > b,\ b > c \;\Rightarrow\; a > c$$
$$a \ge b,\ b \ge c \;\Rightarrow\; a \ge c$$

This property is what makes possible the writing of a continued inequality, such as
$3 < a < b < 8$ or $7 \le x \le 24$. (In writing a continued inequality, the inequality signs are
as a rule arranged in the same direction, usually with the smallest number on the left.)

The most important rules of inequalities are those governing the addition (subtraction) 
of a number to (from) an inequality, the multiplication or division of an inequality by a 
number, and the squaring of an inequality. Specifically, these rules are as follows. 


Rule I (addition and subtraction)   $a > b \;\Rightarrow\; a \pm k > b \pm k$

An inequality will continue to hold if an equal quantity is added to or subtracted from each
side. This rule may be generalized thus: If $a > b > c$, then $a \pm k > b \pm k > c \pm k$.

Rule II (multiplication and division)

$$a > b \;\Rightarrow\; \begin{cases} ka > kb & (k > 0) \\ ka < kb & (k < 0) \end{cases}$$

The multiplication of both sides by a positive number preserves the inequality, but a
negative multiplier will cause the sense (or direction) of the inequality to be reversed.


Example 1

Since $6 > 5$, multiplication by 3 will yield $3(6) > 3(5)$, or $18 > 15$; but multiplication by $-3$
will result in $(-3)6 < (-3)5$, or $-18 < -15$.

Division of an inequality by a number $n$ is equivalent to multiplication by the number
$1/n$; therefore the rule on division is subsumed under the rule on multiplication.

Rule III (squaring)   $a > b,\ (b \ge 0) \;\Rightarrow\; a^2 > b^2$

If its two sides are both nonnegative, the inequality will continue to hold when both sides
are squared.

Example 2

Since $4 > 3$ and since both sides are positive, we have $4^2 > 3^2$, or $16 > 9$. Similarly, since
$2 > 0$, it follows that $2^2 > 0^2$, or $4 > 0$.

Rules I through III have been stated in terms of strict inequalities, but their validity is
unaffected if the $>$ signs are replaced by $\ge$ signs.
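Rules I through III can be spot-checked over a small grid of integers. This is only an illustrative sketch: the grid `values` and the variable names are assumptions, and an exhaustive check over a finite grid does not prove the rules in general.

```python
import itertools

values = range(-5, 6)  # small integer grid, assumed for illustration

# Rule I: a > b implies a + k > b + k and a - k > b - k.
rule1 = all(a + k > b + k and a - k > b - k
            for a, b, k in itertools.product(values, repeat=3) if a > b)

# Rule II: a positive multiplier preserves the inequality, a negative one reverses it.
rule2 = all((k * a > k * b) if k > 0 else (k * a < k * b)
            for a, b, k in itertools.product(values, repeat=3) if a > b and k != 0)

# Rule III: squaring preserves a > b when both sides are nonnegative.
rule3 = all(a**2 > b**2
            for a, b in itertools.product(values, repeat=2) if a > b >= 0)

# Without nonnegativity the squaring rule can fail: 1 > -2, yet 1 < 4.
squaring_needs_nonnegativity = (1 > -2) and not (1**2 > (-2)**2)

print(rule1, rule2, rule3, squaring_needs_nonnegativity)
```

The last line of the check shows why the restriction $b \ge 0$ in Rule III matters.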




Absolute Values and Inequalities 

When the domain of a variable $x$ is an open interval $(a, b)$, the domain may be denoted by
the set $\{x \mid a < x < b\}$ or, more simply, by the inequality $a < x < b$. Similarly, if it is a
closed interval $[a, b]$, it may be expressed by the weak inequality $a \le x \le b$. In the special
case of an interval of the form $(-a, a)$, say, $(-10, 10)$, it may be represented either by
the inequality $-10 < x < 10$ or, alternatively, by the inequality

$$|x| < 10$$

where the symbol $|x|$ denotes the absolute value (or numerical value) of $x$.
For any real number $n$, the absolute value of $n$ is defined as follows:†

$$|n| \equiv \begin{cases} n & (\text{if } n > 0) \\ -n & (\text{if } n < 0) \\ 0 & (\text{if } n = 0) \end{cases} \tag{6.8}$$

Note that, if $n = 15$, then $|15| = 15$; but if $n = -15$, we find

$$|-15| = -(-15) = 15$$

also. In effect, therefore, the absolute value of any real number is simply its numerical value
after the sign is removed. For this reason, we always have $|n| = |-n|$. The absolute value
of $n$ is also called the modulus of $n$.

Given the expression $|x| = 10$, we may conclude from (6.8) that $x$ must be either
10 or $-10$. By the same token, the expression $|x| < 10$ means that (1) if $x > 0$, then
$x \equiv |x| < 10$, so that $x$ must be less than 10; but also (2) if $x < 0$, then according to (6.8)
we have $-x \equiv |x| < 10$, or $x > -10$, so that $x$ must be greater than $-10$. Hence, by
combining the two parts of this result, we see that $x$ must lie within the open interval $(-10, 10)$.
In general, we can write

$$|x| < n \;\Leftrightarrow\; -n < x < n \qquad (n > 0) \tag{6.9}$$

which can also be extended to weak inequalities as follows:

$$|x| \le n \;\Leftrightarrow\; -n \le x \le n \qquad (n \ge 0) \tag{6.10}$$





Because they are themselves numbers, the absolute values of two numbers $m$ and $n$
can be added, subtracted, multiplied, and divided. The following properties characterize
absolute values:

$$|m| + |n| \ge |m + n|$$
$$|m| \cdot |n| = |m \cdot n|$$
$$\frac{|m|}{|n|} = \left|\frac{m}{n}\right|$$

The first of these, interestingly, involves an inequality rather than an equation. The reason
for this is easily seen: whereas the left-hand expression $|m| + |n|$ is definitely a sum of two
numerical values (both taken as positive), the expression $|m + n|$ is the numerical value of
either a sum (if $m$ and $n$ are, say, both positive) or a difference (if $m$ and $n$ have opposite
signs). Thus the left side may exceed the right side.

† We caution again that, although the absolute-value notation is similar to that of a first-order
determinant, these two concepts are entirely different. The definition of a first-order determinant is
$|a_{ij}| \equiv a_{ij}$, regardless of the sign of $a_{ij}$. In the definition of the absolute value $|n|$, on the other hand,
the sign of $n$ will make a difference. The context of the discussion should normally make it clear
whether an absolute value or a first-order determinant is under consideration.


Example 3

If $m = 5$ and $n = 3$, then $|m| + |n| = |m + n| = 8$. But if $m = 5$ and $n = -3$, then
$|m| + |n| = 5 + 3 = 8$, whereas

$$|m + n| = |5 - 3| = 2$$

is a smaller number.


In the other two properties, on the other hand, it makes no difference whether m and n 
have identical or opposite signs, since, in taking the absolute value of the product or 
quotient on the right-hand side, the sign of the latter term will be removed in any case. 


Example 4

If $m = 7$ and $n = 8$, then $|m| \cdot |n| = |m \cdot n| = 7(8) = 56$. But even if $m = -7$ and $n = 8$
(opposite signs), we still get the same result from

$$|m| \cdot |n| = |-7| \cdot |8| = 7(8) = 56$$

and

$$|m \cdot n| = |-7(8)| = 7(8) = 56$$
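The three absolute-value properties can be checked over a grid of exact fractions. This is a minimal sketch; the grid `values` and the variable names are assumptions made for illustration, and the check covers only the sampled pairs.

```python
from fractions import Fraction
import itertools

# Half-integers from -3 to 3, including 0; an assumed sample grid.
values = [Fraction(n, 2) for n in range(-6, 7)]
pairs = list(itertools.product(values, repeat=2))

# |m| + |n| >= |m + n|
sum_property = all(abs(m) + abs(n) >= abs(m + n) for m, n in pairs)

# |m| * |n| == |m * n|
product_property = all(abs(m) * abs(n) == abs(m * n) for m, n in pairs)

# |m| / |n| == |m / n|, with n nonzero
quotient_property = all(abs(m) / abs(n) == abs(m / n) for m, n in pairs if n != 0)

# Pairs for which the first property holds as a strict inequality.
strict_pairs = [(m, n) for m, n in pairs if abs(m) + abs(n) > abs(m + n)]
strict_only_for_opposite_signs = all(m * n < 0 for m, n in strict_pairs)

print(sum_property, product_property, quotient_property, strict_only_for_opposite_signs)
```

On this grid the first property is strict exactly when $m$ and $n$ have opposite signs, which matches the explanation given for why it is an inequality rather than an equation.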


Solution of an Inequality 

Like an equation, an inequality containing a variable (say, x) may have a solution; the solu- 
tion, if it exists, is a set of values of x which make the inequality a true statement. Such a 
solution will itself usually be in the form of an inequality. 


Example 5

Find the solution of the inequality

$$3x - 3 > x + 1$$

As in solving an equation, the variable terms should first be collected on one side of the
inequality. By adding $(3 - x)$ to both sides, we obtain

$$3x - 3 + 3 - x > x + 1 + 3 - x$$

or

$$2x > 4$$

Multiplying both sides by $\frac{1}{2}$ (which does not reverse the sense of the inequality, because
$\frac{1}{2} > 0$) will then yield the solution

$$x > 2$$

which is itself an inequality. This solution is not a single number, but a set of numbers.
Therefore we may also express the solution as the set $\{x \mid x > 2\}$ or as the open interval
$(2, \infty)$.


Example 6

Solve the inequality $|1 - x| \le 3$. First, let us get rid of the absolute-value notation by
utilizing (6.10). The given inequality is equivalent to the statement that

$$-3 \le 1 - x \le 3$$

or, after subtracting 1 from each side,

$$-4 \le -x \le 2$$

Multiplying each side by $(-1)$, we then get

$$4 \ge x \ge -2$$

where the sense of inequality has been duly reversed. Writing the smaller number first, we
may express the solution in the form of the inequality

$$-2 \le x \le 4$$

or in the form of the set $\{x \mid -2 \le x \le 4\}$ or the closed interval $[-2, 4]$.
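A solution set obtained this way can be compared against the original inequality on a grid of test values. The sketch below does this for the inequalities $3x - 3 > x + 1$ and $|1 - x| \le 3$; the grid and the variable names are assumptions for illustration, and agreement on a grid is a sanity check rather than a derivation.

```python
from fractions import Fraction

# Test values from -10 to 10 in steps of 1/4 (an assumed grid).
grid = [Fraction(n, 4) for n in range(-40, 41)]

# 3x - 3 > x + 1 should hold exactly when x > 2.
linear_agrees = all((3 * x - 3 > x + 1) == (x > 2) for x in grid)

# |1 - x| <= 3 should hold exactly when -2 <= x <= 4.
absolute_agrees = all((abs(1 - x) <= 3) == (-2 <= x <= 4) for x in grid)

# The endpoints belong to the closed interval [-2, 4] but not to the open interval (2, oo).
endpoints_ok = abs(1 - (-2)) <= 3 and abs(1 - 4) <= 3 and not (3 * 2 - 3 > 2 + 1)

print(linear_agrees, absolute_agrees, endpoints_ok)
```

Checking the endpoints separately is a quick way to confirm whether a solution should be written with strict or weak inequality signs.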
Sometimes, a problem may call for the satisfaction of several inequalities in several vari- 


ables simultaneously; then we must solve a system of simultaneous inequalities. This prob- 
lem arises, for example, in nonlinear programming, which will be discussed in Chap. 13. 





EXERCISE 6.5 


1. Solve the following inequalities: 
(a) $3x - 1 < 7x + 2$    (c) $5x + 1 < x + 3$
(b) $2x + 5 < x - 4$    (d) $2x - 1 < 6x + 5$


2. If $8x - 3 < 0$ and $8x > 0$, express these in a continued inequality and find its solution.
3. Solve the following: 
(a) $|x + 1| < 6$    (b) $|4 - 3x| < 2$    (c) $|2x + 3| \le 5$


6.6 Limit Theorems 





Our interest in rates of change led us to the consideration of the concept of derivative,
which, being in the nature of the limit of a difference quotient, in turn prompted us to study
questions of the existence and evaluation of a limit. The basic process of limit evaluation,
as illustrated in Sec. 6.4, involves letting the variable $v$ approach a particular number
(say, $N$) and observing the value that $q$ approaches. When actually evaluating the limit of a
function, however, we may draw upon certain established limit theorems, which can
materially simplify the task, especially for complicated functions.


Theorems Involving a Single Function
When a single function $q = g(v)$ is involved, the following theorems are applicable.

Theorem I   If $q = av + b$, then $\lim_{v \to N} q = aN + b$ ($a$ and $b$ are constants).

Example 1

Given $q = 5v + 7$, we have $\lim_{v \to 2} q = 5(2) + 7 = 17$. Similarly, $\lim_{v \to 0} q = 5(0) + 7 = 7$.

Theorem II   If $q = g(v) = b$, then $\lim_{v \to N} q = b$.

This theorem, which says that the limit of a constant function is the constant in that
function, is merely a special case of Theorem I, with $a = 0$. (You have already encountered an
example of this case in Exercise 6.2-3.)

Theorem III   If $q = v$, then $\lim_{v \to N} q = N$.
If $q = v^k$, then $\lim_{v \to N} q = N^k$.

Example 2

Given $q = v^3$, we have $\lim_{v \to 2} q = (2)^3 = 8$.

You may have noted that, in Theorems I through III, what is done to find the limit of $q$
as $v \to N$ is indeed to let $v = N$. But these are special cases, and they do not vitiate the
general rule that "$v \to N$" does not mean "$v = N$."


Theorems Involving Two Functions
If we have two functions of the same independent variable $v$, $q_1 = g(v)$ and $q_2 = h(v)$, and
if both functions possess limits as follows:

$$\lim_{v \to N} q_1 = L_1 \qquad \lim_{v \to N} q_2 = L_2$$

where $L_1$ and $L_2$ are two finite numbers, the following theorems are applicable.

Theorem IV (sum-difference limit theorem)

$$\lim_{v \to N} (q_1 \pm q_2) = L_1 \pm L_2$$

The limit of a sum (difference) of two functions is the sum (difference) of their respective
limits.

In particular, we note that

$$\lim_{v \to N} 2q_1 = \lim_{v \to N} (q_1 + q_1) = L_1 + L_1 = 2L_1$$

which is in line with Theorem I.

Theorem V (product limit theorem)

$$\lim_{v \to N} (q_1 q_2) = L_1 L_2$$

The limit of a product of two functions is the product of their limits.

Applied to the square of a function, this gives

$$\lim_{v \to N} (q_1 q_1) = L_1 L_1 = L_1^2$$

which is in line with Theorem III.

Theorem VI (quotient limit theorem)

$$\lim_{v \to N} \frac{q_1}{q_2} = \frac{L_1}{L_2} \qquad (L_2 \neq 0)$$

The limit of a quotient of two functions is the quotient of their limits. Naturally, the limit
$L_2$ is restricted to be nonzero; otherwise the quotient is undefined.

Example 3

Find $\lim_{v \to 0} (1 + v)/(2 + v)$. Since we have here $\lim_{v \to 0} (1 + v) = 1$ and $\lim_{v \to 0} (2 + v) = 2$, the desired
limit is $\frac{1}{2}$.
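The sum, product, and quotient theorems can be illustrated with the two functions of Example 3, $q_1 = 1 + v$ and $q_2 = 2 + v$, as $v \to 0$. This is a sketch under assumptions: the names `q1`, `q2`, and `steps` are illustrative, and only right-side approach values are tabulated.

```python
from fractions import Fraction


def q1(v):
    return 1 + v


def q2(v):
    return 2 + v


N = Fraction(0)
L1, L2 = q1(N), q2(N)   # by Theorem I, the limits are 1 and 2

# Distances h = 1/10, 1/100, ... from N; each gap below should shrink toward 0.
steps = [Fraction(1, 10**k) for k in range(1, 6)]
sum_gaps = [(q1(N + h) + q2(N + h)) - (L1 + L2) for h in steps]
product_gaps = [q1(N + h) * q2(N + h) - L1 * L2 for h in steps]
quotient_gaps = [q1(N + h) / q2(N + h) - L1 / L2 for h in steps]

for h, gap in zip(steps, quotient_gaps):
    print(f"h = {h}: q1/q2 - 1/2 = {gap}")
```

Each list of gaps shrinks toward zero as $h$ does, consistent with the limits $L_1 + L_2 = 3$, $L_1 L_2 = 2$, and $L_1/L_2 = \frac{1}{2}$.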

Remember that $L_1$ and $L_2$ represent finite numbers; otherwise these theorems do not
apply. In the case of Theorem VI, furthermore, $L_2$ must be nonzero as well. If these
restrictions are not satisfied, we must fall back on the method of limit evaluation illustrated
in Examples 2 and 3 in Sec. 6.4, which relate to the cases, respectively, of $L_2$ being zero
and of $L_2$ being infinite.
Limit of a Polynomial Function

With the given limit theorems at our disposal, we can easily evaluate the limit of any
polynomial function

$$q = g(v) = a_0 + a_1 v + a_2 v^2 + \cdots + a_n v^n \tag{6.11}$$

as $v$ tends to the number $N$. Since the limits of the separate terms are, respectively,

$$\lim_{v \to N} a_0 = a_0 \qquad \lim_{v \to N} a_1 v = a_1 N \qquad \lim_{v \to N} a_2 v^2 = a_2 N^2 \qquad (\text{etc.})$$

the limit of the polynomial function is (by the sum limit theorem)

$$\lim_{v \to N} q = a_0 + a_1 N + a_2 N^2 + \cdots + a_n N^n \tag{6.12}$$

This limit is also, we note, actually equal to $g(N)$, that is, equal to the value of the function
in (6.11) when $v = N$. This particular result will prove important in discussing the concept
of continuity of the polynomial function.
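The result in (6.12) can be illustrated for one specific polynomial. The sketch below is only an example: the coefficients, the point $N = 2$, and the helper name `poly` are assumptions, and Horner's rule is used simply as a convenient way to evaluate the polynomial.

```python
from fractions import Fraction


def poly(coeffs, v):
    """Evaluate a0 + a1*v + ... + an*v**n by Horner's rule; coeffs = [a0, a1, ..., an]."""
    result = 0
    for a in reversed(coeffs):
        result = result * v + a
    return result


coeffs = [1, 2, 3]            # hypothetical polynomial q = 1 + 2v + 3v^2
N = Fraction(2)
value_at_N = poly(coeffs, N)  # a0 + a1*N + a2*N^2

# Evaluate the polynomial slightly to the right and left of N.
steps = [Fraction(1, 10**k) for k in range(1, 6)]
right_gaps = [poly(coeffs, N + h) - value_at_N for h in steps]
left_gaps = [poly(coeffs, N - h) - value_at_N for h in steps]

for h, gap in zip(steps, right_gaps):
    print(f"h = {h}: g(N + h) - g(N) = {gap}")
```

For this polynomial the gap on the right is $14h + 3h^2$ and on the left $-14h + 3h^2$, so both vanish as $h \to 0$ and the limit coincides with the function value $g(2) = 17$.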





EXERCISE 6.6 
1. Find the limits of the function $q = 7 - 9v + v^2$:
(a) As $v \to 0$    (b) As $v \to 3$    (c) As $v \to -1$
2. Find the limits of $q = (v + 2)(v - 3)$:
(a) As $v \to -1$    (b) As $v \to 0$    (c) As $v \to 5$
3. Find the limits of $q = (3v + 5)/(v + 2)$:
(a) As $v \to 0$    (b) As $v \to 5$    (c) As $v \to 1$


6.7 Continuity and Differentiability of a Function





The preceding discussion of the concept of limit and its evaluation can now be used to
define the continuity and differentiability of a function. These notions bear directly on the
derivative of the function, which is what interests us.


Continuity of a Function 

When a function $q = g(v)$ possesses a limit as $v$ tends to the point $N$ in the domain, and
when this limit is also equal to $g(N)$ (that is, equal to the value of the function at $v = N$),
the function is said to be continuous at $N$. As defined here, the term continuity involves no
less than three requirements: (1) the point $N$ must be in the domain of the function; i.e.,
$g(N)$ is defined; (2) the function must have a limit as $v \to N$; i.e., $\lim_{v \to N} g(v)$ exists; and
(3) that limit must be equal in value to $g(N)$; i.e., $\lim_{v \to N} g(v) = g(N)$.

It is important to note that while the point $(N, L)$ was excluded from consideration in
discussing the limit of the curve in Fig. 6.3, we are no longer excluding it in the present
context. Rather, as the third requirement specifically states, the point $(N, L)$ must be on the
graph of the function before the function can be considered as continuous at point $N$.




Example 1 


Let us check whether the functions shown in Fig. 6.2 are continuous. In diagram a, all
three requirements are met at point $N$. Point $N$ is in the domain; $q$ has the limit $L$ as $v \to N$;
and the limit $L$ happens also to be the value of the function at $N$. Thus, the function
represented by that curve is continuous at $N$. The same is true of the function depicted in
Fig. 6.2b, since $L$ is the limit of the function as $v$ approaches the value $N$ in the domain, and
since $L$ is also the value of the function at $N$. This last graphic example should suffice to
establish that the continuity of a function at point $N$ does not necessarily imply that the graph
of the function is "smooth" at $v = N$, for the point $(N, L)$ in Fig. 6.2b is actually a "sharp"
point and yet the function is continuous at that value of $v$.

When a function $q = g(v)$ is continuous at all values of $v$ in the interval $(a, b)$, it is said
to be continuous in that interval. If the function is continuous at all points in a subset $S$ of
the domain (where the subset $S$ may be the union of several disjoint intervals), it is said to
be continuous in $S$. And, finally, if the function is continuous at all points in its domain, we
say that it is continuous in its domain. Even in this latter case, however, the graph of the
function may nevertheless show a discontinuity (a gap) at some value of $v$, say, at $v = 5$, if
that value of $v$ is not in its domain.

Again referring to Fig. 6.2, we see that in diagram c the function is discontinuous at N
because a limit does not exist at that point, in violation of the second requirement of
continuity. Nevertheless, the function does satisfy the requirements of continuity in the interval
(0, N) of the domain, as well as in the interval [N, ∞). Diagram d obviously is also
discontinuous at v = N. This time, discontinuity emanates from the fact that N is excluded
from the domain, in violation of the first requirement of continuity.

On the basis of the graphs in Fig. 6.2, it appears that sharp points are consistent with
continuity, as in diagram b, but that gaps are taboo, as in diagrams c and d. This is indeed
the case. Roughly speaking, therefore, a function that is continuous in a particular interval
is one whose graph can be drawn for the said interval without lifting the pencil or pen from
the paper, a feat which is possible even if there are sharp points, but impossible when gaps
occur.
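The three requirements can be tried out numerically. The following is only a rough sketch: the helper name `is_continuous_at`, the step size `h`, and the tolerance `tol` are illustrative assumptions, and evaluating a function close to a point can suggest, but never prove, that a limit exists.

```python
def is_continuous_at(g, n, in_domain, h=1e-8, tol=1e-6):
    """Heuristic numerical test of the three continuity requirements at v = n."""
    # Requirement 1: the point n must be in the domain of g.
    if not in_domain(n):
        return False
    # Requirement 2: g must have a limit as v -> n. We approximate the
    # left-side and right-side limits by evaluating g very close to n.
    left, right = g(n - h), g(n + h)
    if abs(left - right) > tol:
        return False
    # Requirement 3: that limit must equal g(n).
    return abs(right - g(n)) <= tol


sharp = lambda v: abs(v - 1)              # sharp point at v = 1, but no gap
jump = lambda v: 0.0 if v < 1 else 1.0    # gap (jump) at v = 1
everywhere = lambda v: True               # domain: all real numbers
without_one = lambda v: v != 1            # domain excludes v = 1

print(is_continuous_at(sharp, 1, everywhere))    # True
print(is_continuous_at(jump, 1, everywhere))     # False
print(is_continuous_at(sharp, 1, without_one))   # False
```

The sharp point passes all three tests, the jump fails the second requirement, and the excluded point fails the first.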


Polynomial and Rational Functions 

Let us now consider the continuity of certain frequently encountered functions. For any
polynomial function, such as q = g(v) in (6.11), we have found from (6.12) that $\lim_{v \to N} q$
exists and is equal to the value of the function at N. Since N is a point (any point) in the
domain of the function, we can conclude that any polynomial function is continuous in its
domain. This is a very useful piece of information, because polynomial functions will be
encountered very often.

What about rational functions? Regarding continuity, there exists an interesting theorem
(the continuity theorem) which states that the sum, difference, product, and quotient of any
finite number of functions that are continuous in the domain are, respectively, also
continuous in the domain. As a result, any rational function (a quotient of two polynomial
functions) must also be continuous in its domain.


Example 2

The rational function

$$q = g(v) = \frac{4v^2}{v^2 + 1}$$

is defined for all finite real numbers; thus its domain consists of the interval (−∞, ∞). For
any number N in the domain, the limit of q is (by the quotient limit theorem)

$$\lim_{v \to N} q = \frac{\lim_{v \to N} (4v^2)}{\lim_{v \to N} (v^2 + 1)} = \frac{4N^2}{N^2 + 1}$$

which is equal to g(N). Thus the three requirements of continuity are all met at N. More- 
over, we note that N can represent any point in the domain of this function; consequently, 
this function is continuous in its domain. 


Example 3

The rational function

$$q = \frac{v^3 + v^2 - 4v - 4}{v^2 - 4}$$

is not defined at v = 2 and at v = −2. Since those two values of v are not in the domain, the
function is discontinuous at v = −2 and v = 2, despite the fact that a limit of q exists as
v → −2 or 2. Graphically, this function will display a gap at each of these two values of v.
But for other values of v (those which are in the domain), this function is continuous.
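A small numerical illustration of this last example follows. It is a sketch only: the function name `g` mirrors the text, exact rational arithmetic from the standard `fractions` module is an arbitrary choice, and the step sizes are illustrative.

```python
from fractions import Fraction


def g(v):
    """q = (v^3 + v^2 - 4v - 4) / (v^2 - 4); undefined at v = 2 and v = -2."""
    return (v**3 + v**2 - 4 * v - 4) / (v**2 - 4)


# Approach v = 2 from both sides with exact rational arithmetic.
for k in (2, 4, 6):
    h = Fraction(1, 10**k)
    print(float(g(2 - h)), float(g(2 + h)))   # both columns close in on 3

# At v = 2 itself the function has no value: the denominator is zero.
try:
    g(2)
except ZeroDivisionError:
    print("g(2) is undefined, so v = 2 is not in the domain")
```

The limit exists at v = 2, yet the first requirement of continuity fails there, which is exactly the gap described in the text.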





Differentiability of a Function 
The previous discussion has provided us with the tools for ascertaining whether any
function has a limit as its independent variable approaches some specific value. Thus we can try
to take the limit of any function y = f(x) as x approaches some chosen value, say, $x_0$.
However, we can also apply the “limit” concept at a different level and take the limit of the
difference quotient of that function, Δy/Δx, as Δx approaches zero. The outcomes of
limit-taking at these two different levels relate to two different, though related, properties
of the function f.

Taking the limit of the function y = f(x) itself, we can, in line with the discussion of
the preceding subsection, examine whether the function f is continuous at x = $x_0$. The
conditions for continuity are (1) x = $x_0$ must be in the domain of the function f, (2) y must have
a limit as x → $x_0$, and (3) the said limit must be equal to $f(x_0)$. When these are satisfied,
we can write

$$\lim_{x \to x_0} f(x) = f(x_0) \qquad \text{[continuity condition]} \tag{6.13}$$


In contrast, when the “limit” concept is applied to the difference quotient Δy/Δx as
Δx → 0, we deal instead with the question of whether the function f is differentiable at
x = $x_0$, i.e., whether the derivative dy/dx exists at x = $x_0$, or whether $f'(x_0)$ exists. The
term differentiable is used here because the process of obtaining the derivative dy/dx is
known as differentiation (also called derivation). Since $f'(x_0)$ exists if and only if the limit
of Δy/Δx exists at x = $x_0$ as Δx → 0, the symbolic expression of the differentiability of
f is

$$f'(x_0) = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} \qquad \text{[differentiability condition]} \tag{6.14}$$




FIGURE 6.4 


These two properties, continuity and differentiability, are very intimately related to each
other. The continuity of f is a necessary condition for its differentiability (although, as we
shall see later, this condition is not sufficient). What this means is that, to be differentiable
at x = $x_0$, the function must first pass the test of being continuous at x = $x_0$. To prove this,
we shall demonstrate that, given a function y = f(x), its continuity at x = $x_0$ follows from
its differentiability at x = $x_0$; i.e., condition (6.13) follows from condition (6.14). Before
doing this, however, let us simplify the notation somewhat by (1) replacing $x_0$ with the
symbol N and (2) replacing ($x_0$ + Δx) with the symbol x. The latter is justifiable because
the postchange value of x can be any number (depending on the magnitude of the change)
and hence is a variable denotable by x. The equivalence of the two notation systems is
shown in Fig. 6.4, where the old notations appear (in brackets) alongside the new. Note that,
with the notational change, Δx now becomes (x − N), so that the expression “Δx → 0”
becomes “x → N,” which is analogous to the expression v → N used before in connection
with the function q = g(v). Accordingly, (6.13) and (6.14) can now be rewritten,
respectively, as





$$\lim_{x \to N} f(x) = f(N) \tag{6.13'}$$

$$f'(N) = \lim_{x \to N} \frac{f(x) - f(N)}{x - N} \tag{6.14'}$$


What we want to show is, therefore, that the continuity condition (6.13') follows from
the differentiability condition (6.14'). First, since the notation x → N implies that x ≠ N,
so that x − N is a nonzero number, it is permissible to write the following identity:

$$f(x) - f(N) \equiv \frac{f(x) - f(N)}{x - N}\,(x - N) \tag{6.15}$$


FIGURE 6.5 




Taking the limit of each side of (6.15) as x > N yields the following results: 


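$$\begin{aligned}
\text{Left side} &= \lim_{x \to N} f(x) - \lim_{x \to N} f(N) && \text{[difference limit theorem]} \\
&= \lim_{x \to N} f(x) - f(N) && \text{[$f(N)$ is a constant]} \\
\text{Right side} &= \lim_{x \to N} \frac{f(x) - f(N)}{x - N} \cdot \lim_{x \to N} (x - N) && \text{[product limit theorem]} \\
&= f'(N)\left(\lim_{x \to N} x - \lim_{x \to N} N\right) && \text{[by (6.14') and difference limit theorem]} \\
&= f'(N)(N - N) = 0
\end{aligned}$$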


Note that we could not have written these results if condition (6.14') had not been granted,
for if f'(N) did not exist, then the right-side expression (and hence also the left-side
expression) in (6.15) would not possess a limit. If f'(N) does exist, however, the two sides
will have limits as shown in the previous equations. Moreover, when the left-side result and
the right-side result are equated, we get $\lim_{x \to N} f(x) - f(N) = 0$, which is identical with
(6.13'). Thus we have proved that continuity, as shown in (6.13'), follows from
differentiability, as shown in (6.14'). In general, if a function is differentiable at every point in its
domain, we may conclude that it must be continuous in its domain.

Although differentiability implies continuity, the converse is not true. That is,
continuity is a necessary, but not a sufficient, condition for differentiability. To demonstrate this,
we merely have to produce a counterexample. Let us consider the function

$$y = f(x) = |x - 2| + 1 \tag{6.16}$$


which is graphed in Fig. 6.5. As can be readily shown, this function is not differentiable,
though continuous, when x = 2. That the function is continuous at x = 2 is easy to
establish. First, x = 2 is in the domain of the function. Second, the limit of y exists as x tends
to 2; to be specific, $\lim_{x \to 2^+} y = \lim_{x \to 2^-} y = 1$. Third, f(2) is also found to be 1. Thus all three
requirements of continuity are met. To show that the function f is not differentiable at
x = 2, we must show that the limit of the difference quotient

$$\lim_{x \to 2} \frac{f(x) - f(2)}{x - 2} = \lim_{x \to 2} \frac{|x - 2| + 1 - 1}{x - 2} = \lim_{x \to 2} \frac{|x - 2|}{x - 2}$$





does not exist. This involves the demonstration of a disparity between the left-side and the
right-side limits. Since, in considering the right-side limit, x must exceed 2, according to the
definition of absolute value in (6.8) we have |x − 2| = x − 2. Thus the right-side limit is

$$\lim_{x \to 2^+} \frac{|x - 2|}{x - 2} = \lim_{x \to 2^+} \frac{x - 2}{x - 2} = \lim_{x \to 2^+} 1 = 1$$








On the other hand, in considering the left-side limit, x must be less than 2; thus, according
to (6.8), |x − 2| = −(x − 2). Consequently, the left-side limit is

$$\lim_{x \to 2^-} \frac{|x - 2|}{x - 2} = \lim_{x \to 2^-} \frac{-(x - 2)}{x - 2} = \lim_{x \to 2^-} (-1) = -1$$





which is different from the right-side limit. This shows that continuity does not guarantee 
differentiability. In sum, all differentiable functions are continuous, but not all continuous 
functions are differentiable. 
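The disparity between the two one-sided limits can be seen by computing the difference quotient of (6.16) on each side of x = 2. This is an illustrative sketch: the helper name `difference_quotient` is an assumption, and the step sizes are powers of 1/2 only so that the floating-point arithmetic is exact.

```python
def f(x):
    """The function in (6.16): y = |x - 2| + 1."""
    return abs(x - 2) + 1


def difference_quotient(func, x0, dx):
    """[func(x0 + dx) - func(x0)] / dx for a nonzero change dx."""
    return (func(x0 + dx) - func(x0)) / dx


steps = (0.5, 0.25, 0.125)
right = [difference_quotient(f, 2, dx) for dx in steps]    # x approaches 2 from above
left = [difference_quotient(f, 2, -dx) for dx in steps]    # x approaches 2 from below

print(right)   # [1.0, 1.0, 1.0]
print(left)    # [-1.0, -1.0, -1.0]
```

However small the step, the quotient stays at +1 on the right and at −1 on the left, so no single limit exists at x = 2.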

In Fig. 6.5, the nondifferentiability of the function at x = 2 is manifest in the fact that
the point (2, 1) has no tangent line defined, and hence no definite slope can be assigned to
the point. Specifically, to the left of that point, the curve has a slope of −1, but to the right
it has a slope of +1, and the slopes on the two sides display no tendency to approach a
common magnitude at x = 2. The point (2, 1) is, of course, a special point; it is the only
sharp point on the curve. At other points on the curve, the derivative is defined and the
function is differentiable. More specifically, the function in (6.16) can be divided into two
linear functions as follows:

$$\begin{aligned}
\text{Left part:}\quad & y = -(x - 2) + 1 = 3 - x && (x < 2) \\
\text{Right part:}\quad & y = (x - 2) + 1 = x - 1 && (x > 2)
\end{aligned}$$


The left part is differentiable in the interval (−∞, 2), and the right part is differentiable in
the interval (2, ∞) in the domain.

In general, differentiability is a more restrictive condition than continuity, because it
requires something beyond continuity. Continuity at a point only rules out the presence of a
gap, whereas differentiability rules out “sharpness” as well. Therefore, differentiability
calls for “smoothness” of the function (curve) as well as its continuity. Most of the specific
functions employed in economics have the property that they are differentiable everywhere.
When general functions are used, moreover, they are often assumed to be everywhere
differentiable, as we shall in the subsequent discussion.








EXERCISE 6.7 


1. A function y = f(x) is discontinuous at x = $x_0$ when any of the three requirements for
continuity is violated at x = $x_0$. Construct three graphs to illustrate the violation of each
of those requirements.

2. Taking the set of all finite real numbers as the domain of the function $q = g(v) = v^2 - 5v - 2$:
(a) Find the limit of q as v tends to N (a finite real number).
(b) Check whether this limit is equal to g(N).
(c) Check whether the function is continuous at N and continuous in its domain.

3. Given the function $q = g(v) = \dfrac{v + 2}{v^2 + 2}$:
(a) Use the limit theorems to find $\lim_{v \to N} q$, N being a finite real number.
(b) Check whether this limit is equal to g(N).
(c) Check the continuity of the function g(v) at N and in its domain (−∞, ∞).

4. Given $y = f(x) = \dfrac{x^2 - 9x + 20}{x - 4}$:
(a) Is it possible to apply the quotient limit theorem to find the limit of this function as
x → 4?
(b) Is this function continuous at x = 4? Why?
(c) Find a function which, for x ≠ 4, is equivalent to the given function, and obtain
from the equivalent function the limit of y as x → 4.

5. In the rational function in Example 3, the numerator is evenly divisible by the
denominator, and the quotient is v + 1. Can we for that reason replace that function outright
by q = v + 1? Why or why not?

6. On the basis of the graphs of the six functions in Fig. 2.8, would you conclude that
each such function is differentiable at every point in its domain? Explain.


Chapter 7

Rules of Differentiation and Their Use in Comparative Statics


The central problem of comparative-static analysis, that of finding a rate of change, can be
identified with the problem of finding the derivative of some function y = f(x), provided
only an infinitesimal change in x is being considered. Even though the derivative dy/dx is
defined as the limit of the difference quotient q = g(v) as v → 0, it is by no means
necessary to undertake the process of limit-taking each time the derivative of a function is
sought, for there exist various rules of differentiation (derivation) that will enable us to
obtain the desired derivatives directly. Instead of going into comparative-static models
immediately, therefore, let us begin by learning some rules of differentiation.


7.1 Rules of Differentiation for a Function of One Variable 







First, let us discuss three rules that apply, respectively, to the following types of function of
a single independent variable: y = k (constant function), and $y = x^n$ and $y = cx^n$ (power
functions). All these have smooth, continuous graphs and are therefore differentiable
everywhere.


Constant-Function Rule 

The derivative of a constant function y = k, or f(x) = k, is identically zero, i.e., is zero
for all values of x. Symbolically, this rule may be stated as: Given y = f(x) = k, the
derivative is

$$\frac{dy}{dx} = \frac{dk}{dx} = 0 \quad \text{or} \quad f'(x) = 0$$

Alternatively, we may state the rule as: Given y = f(x) = k, the derivative is

$$\frac{d}{dx} y = \frac{d}{dx} f(x) = \frac{d}{dx} k = 0$$


where the derivative symbol has been separated into two parts, d/dx on the one hand, and
y [or f(x) or k] on the other. The first part, d/dx, is an operator symbol, which instructs us
to perform a particular mathematical operation. Just as the operator symbol √ instructs
us to take a square root, the symbol d/dx represents an instruction to take the derivative of,
or to differentiate, (some function) with respect to the variable x. The function to be
operated on (to be differentiated) is indicated in the second part; here it is y = f(x) = k.

The proof of the rule is as follows. Given f(x) = k, we have f(N) = k for any value
of N. Thus the value of f'(N), the value of the derivative at x = N, as defined in (6.14'),
is

$$f'(N) = \lim_{x \to N} \frac{f(x) - f(N)}{x - N} = \lim_{x \to N} \frac{k - k}{x - N} = \lim_{x \to N} 0 = 0$$

Moreover, since N represents any value of x at all, the result f'(N) = 0 can be immediately
generalized to f'(x) = 0. This proves the rule.

It is important to distinguish clearly between the statement f'(x) = 0 and the
similar-looking but different statement $f'(x_0) = 0$. By f'(x) = 0, we mean that the derivative
function f' has a zero value for all values of x; in writing $f'(x_0) = 0$, on the other hand, we
are merely associating the zero value of the derivative with a particular value of x, namely,
x = $x_0$.

As discussed before, the derivative of a function has its geometric counterpart in
the slope of the curve. The graph of a constant function, say, a fixed-cost function
C_F = f(Q) = $1,200, is a horizontal straight line with a zero slope throughout.
Correspondingly, the derivative must also be zero for all values of Q:

$$\frac{d}{dQ} C_F = \frac{d}{dQ} 1200 = 0$$

Power-Function Rule

The derivative of a power function $y = f(x) = x^n$ is $nx^{n-1}$. Symbolically, this is
expressed as

$$\frac{d}{dx} x^n = nx^{n-1} \quad \text{or} \quad f'(x) = nx^{n-1} \tag{7.1}$$

Example 1

The derivative of $y = x^3$ is $\dfrac{dy}{dx} = \dfrac{d}{dx} x^3 = 3x^2$.

Example 2

The derivative of $y = x^9$ is $\dfrac{d}{dx} x^9 = 9x^8$.


This rule is valid for any real-valued power of x; that is, the exponent can be any real
number. But we shall prove it only for the case where n is some positive integer. In the
simplest case, that of n = 1, the function is f(x) = x, and according to the rule, the
derivative is

$$\frac{d}{dx} x = 1(x^0) = 1$$


The proof of this result follows easily from the definition of f'(N) in (6.14'). Given
f(x) = x, the derivative value at any value of x, say, x = N, is

$$f'(N) = \lim_{x \to N} \frac{f(x) - f(N)}{x - N} = \lim_{x \to N} \frac{x - N}{x - N} = \lim_{x \to N} 1 = 1$$

Since N represents any value of x, it is permissible to write f'(x) = 1. This proves the rule
for the case of n = 1. As the graphical counterpart of this result, we see that the function
y = f(x) = x plots as a 45° line, and it has a slope of +1 throughout.

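For the cases of larger integers, n = 2, 3, ..., let us first note the following identities:

$$\begin{aligned}
\frac{x^2 - N^2}{x - N} &= x + N && \text{[2 terms on the right]} \\
\frac{x^3 - N^3}{x - N} &= x^2 + Nx + N^2 && \text{[3 terms on the right]} \\
&\;\;\vdots \\
\frac{x^n - N^n}{x - N} &= x^{n-1} + Nx^{n-2} + N^2x^{n-3} + \cdots + N^{n-1} && \text{[$n$ terms on the right]}
\end{aligned} \tag{7.2}$$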


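On the basis of (7.2), we can express the derivative of a power function $f(x) = x^n$ at
x = N as follows:

$$\begin{aligned}
f'(N) &= \lim_{x \to N} \frac{f(x) - f(N)}{x - N} = \lim_{x \to N} \frac{x^n - N^n}{x - N} \\
&= \lim_{x \to N} \left(x^{n-1} + Nx^{n-2} + \cdots + N^{n-1}\right) && \text{[by (7.2)]} \\
&= \lim_{x \to N} x^{n-1} + \lim_{x \to N} Nx^{n-2} + \cdots + \lim_{x \to N} N^{n-1} && \text{[sum limit theorem]} \\
&= N^{n-1} + N^{n-1} + \cdots + N^{n-1} && \text{[a total of $n$ terms]} \\
&= nN^{n-1}
\end{aligned} \tag{7.3}$$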


Again, N is any value of x; thus this last result can be generalized to

$$f'(x) = nx^{n-1}$$

which proves the rule for n, any positive integer.
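The two facts the proof rests on can be spot-checked with exact integer arithmetic. This is a sketch rather than a proof: the helper name `rhs_of_7_2` and the sample values of x, N, and n are illustrative assumptions.

```python
def rhs_of_7_2(x, N, n):
    """x^(n-1) + N x^(n-2) + ... + N^(n-1): the n terms on the right of (7.2)."""
    return sum(N**k * x**(n - 1 - k) for k in range(n))


for n in range(1, 7):
    for x, N in [(5, 2), (-3, 4), (7, 7)]:
        # Cross-multiplied form of (7.2); it avoids dividing by x - N.
        assert x**n - N**n == (x - N) * rhs_of_7_2(x, N, n)
    # Setting x = N collapses the n terms into n * N^(n-1), as in (7.3).
    assert rhs_of_7_2(3, 3, n) == n * 3 ** (n - 1)

print(rhs_of_7_2(5, 2, 3))   # 25 + 10 + 4 = 39
```

Checking the cross-multiplied form sidesteps the division by x − N, which is undefined at x = N; the limit in (7.3) is what handles that point in the proof itself.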
As mentioned previously, this rule applies even when the exponent n in the power
expression $x^n$ is not a positive integer. The following examples serve to illustrate its
application to the latter cases.


Example 3

Find the derivative of $y = x^0$. Applying (7.1), we find

$$\frac{d}{dx} x^0 = 0(x^{-1}) = 0$$

Example 4

Find the derivative of $y = 1/x^3$. This involves the reciprocal of a power, but by rewriting the
function as $y = x^{-3}$, we can again apply (7.1) to get the derivative:

$$\frac{d}{dx} x^{-3} = -3x^{-4} \quad \left[= \frac{-3}{x^4}\right]$$

Example 5

Find the derivative of $y = \sqrt{x}$. A square root is involved in this case, but since $\sqrt{x} = x^{1/2}$, the
derivative can be found as follows:

$$\frac{d}{dx} x^{1/2} = \frac{1}{2} x^{-1/2} \quad \left[= \frac{1}{2\sqrt{x}}\right]$$


Derivatives are themselves functions of the independent variable x. In Example 1, for
instance, the derivative is $dy/dx = 3x^2$, or $f'(x) = 3x^2$, so that a different value of x will
result in a different value of the derivative, such as

$$f'(1) = 3(1)^2 = 3 \qquad f'(2) = 3(2)^2 = 12$$

These specific values of the derivative can be expressed alternatively as

$$\left.\frac{dy}{dx}\right|_{x=1} \quad \text{and} \quad \left.\frac{dy}{dx}\right|_{x=2}$$

but the notations f'(1) and f'(2) are obviously preferable because of their simplicity.

It is of the utmost importance to realize that, to find the derivative values f'(1), f'(2),
etc., we must first differentiate the function f(x), to get the derivative function f'(x), and
then let x assume specific values in f'(x). To substitute specific values of x into the
primitive function f(x) prior to differentiation is definitely not permissible. As an illustration, if
we let x = 1 in the function of Example 1 before differentiation, the function will
degenerate into y = x = 1, a constant function, which will yield a zero derivative rather than
the correct answer of $f'(x) = 3x^2$.
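The order of operations stressed above (differentiate first, substitute afterward) can be illustrated with a tiny sketch. The representation of a power term $cx^n$ as a pair `(c, n)` and the helper names `differentiate_power` and `evaluate` are assumptions made purely for illustration.

```python
def differentiate_power(c, n):
    """Return (c*n, n-1): the coefficient and exponent of d/dx (c x^n)."""
    return (c * n, n - 1)


def evaluate(term, x):
    """Value of the term c x^n at the given x."""
    c, n = term
    return c * x**n


f = (1, 3)                            # f(x) = x^3, as in Example 1
f_prime = differentiate_power(*f)     # f'(x) = 3x^2

# Correct order: differentiate first, then let x assume specific values.
correct = evaluate(f_prime, 1)

# Wrong order: substituting x = 1 first leaves only the constant 1 = 1 * x^0,
# and the derivative of a constant is zero.
constant = (evaluate(f, 1), 0)
wrong = evaluate(differentiate_power(*constant), 1)

print(correct, evaluate(f_prime, 2), wrong)
```

Here `correct` is 3 and the value at x = 2 is 12, matching f'(1) and f'(2) in the text, while the wrong order yields zero.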








Power-Function Rule Generalized 
When a multiplicative constant c appears in the power function, so that $f(x) = cx^n$, its
derivative is

$$\frac{d}{dx} cx^n = cnx^{n-1} \quad \text{or} \quad f'(x) = cnx^{n-1}$$

This result shows that, in differentiating $cx^n$, we can simply retain the multiplicative
constant c intact and then differentiate the term $x^n$ according to (7.1).


Example 6

Given y = 2x, we have $dy/dx = 2x^0 = 2$.

Example 7

Given $f(x) = 4x^3$, the derivative is $f'(x) = 12x^2$.

Example 8

The derivative of $f(x) = 3x^{-2}$ is $f'(x) = -6x^{-3}$.


For a proof of this new rule, consider the fact that for any value of x, say, x = N, the
value of the derivative of $f(x) = cx^n$ is

$$\begin{aligned}
f'(N) &= \lim_{x \to N} \frac{f(x) - f(N)}{x - N} = \lim_{x \to N} \frac{cx^n - cN^n}{x - N} = \lim_{x \to N} c\left(\frac{x^n - N^n}{x - N}\right) \\
&= \lim_{x \to N} c \cdot \lim_{x \to N} \frac{x^n - N^n}{x - N} && \text{[product limit theorem]} \\
&= c \lim_{x \to N} \frac{x^n - N^n}{x - N} && \text{[limit of a constant]} \\
&= cnN^{n-1} && \text{[from (7.3)]}
\end{aligned}$$

In view of the fact that N is any value of x, this last result can be generalized immediately to
$f'(x) = cnx^{n-1}$, which proves the rule.





EXERCISE 7.1 


1. Find the derivative of each of the following functions:
(a) $y = x^{12}$    (c) $y = 7x^5$    (e) $w = -4u^{1/2}$
(b) $y = 63$    (d) $w = 3u^{-1}$    (f) $w = 4u^{1/4}$

2. Find the following:
(a) $\dfrac{d}{dx}(-x^{-4})$    (c) $\dfrac{d}{dw} 5w^4$    (e) $\dfrac{d}{du} au^b$
(b) $\dfrac{d}{dx} 9x^{1/3}$    (d) $\dfrac{d}{dx} cx^2$    (f) $\dfrac{d}{du}(-au^{-b})$

3. Find f'(1) and f'(2) from the following functions:
(a) $y = f(x) = 18x$    (c) $f(x) = -5x^{-2}$    (e) $f(w) = 6w^{1/3}$
(b) $y = f(x) = cx^3$    (d) $f(x) = \frac{3}{4}x^{4/3}$    (f) $f(w) = -3w^{-1/6}$

4. Graph a function f(x) that gives rise to the derivative function f'(x) = 0. Then graph a
function g(x) characterized by $g'(x_0) = 0$.


7.2 Rules of Differentiation Involving 
Two or More Functions of the Same Variable 





The three rules presented in Sec. 7.1 are each concerned with a single given function f(x).
Now suppose that we have two differentiable functions of the same variable x, say, f(x) and
g(x), and we want to differentiate the sum, difference, product, or quotient formed with
these two functions. In such circumstances, are there appropriate rules that apply? More
concretely, given two functions, say, $f(x) = 3x^2$ and $g(x) = 9x^{12}$, how do we get the
derivative of, say, $3x^2 + 9x^{12}$, or the derivative of $(3x^2)(9x^{12})$?


Sum-Difference Rule
The derivative of a sum (difference) of two functions is the sum (difference) of the
derivatives of the two functions:

$$\frac{d}{dx}[f(x) \pm g(x)] = \frac{d}{dx} f(x) \pm \frac{d}{dx} g(x) = f'(x) \pm g'(x)$$

The proof of this again involves the application of the definition of a derivative and of the
various limit theorems. We shall omit the proof and, instead, merely verify its validity and
illustrate its application.

Example 1

From the function $y = 14x^3$, we can obtain the derivative $dy/dx = 42x^2$. But $14x^3 = 5x^3 + 9x^3$,
so that y may be regarded as the sum of two functions $f(x) = 5x^3$ and
$g(x) = 9x^3$. According to the sum rule, we then have

$$\frac{dy}{dx} = \frac{d}{dx}(5x^3 + 9x^3) = \frac{d}{dx} 5x^3 + \frac{d}{dx} 9x^3 = 15x^2 + 27x^2 = 42x^2$$

which is identical with our earlier result.





This rule, which we stated in terms of two functions, can easily be extended to more
functions. Thus, it is also valid to write

$$\frac{d}{dx}[f(x) \pm g(x) \pm h(x)] = f'(x) \pm g'(x) \pm h'(x)$$

Example 2

The function cited in Example 1, $y = 14x^3$, can be written as $y = 2x^3 + 13x^3 - x^3$. The
derivative of the latter, according to the sum-difference rule, is

$$\frac{dy}{dx} = \frac{d}{dx}(2x^3 + 13x^3 - x^3) = 6x^2 + 39x^2 - 3x^2 = 42x^2$$

which again checks with the previous answer.

This rule is of great practical importance. With it at our disposal, it is now possible to
find the derivative of any polynomial function, since the latter is nothing but a sum of power
functions.

Example 3

$$\frac{d}{dx}(ax^2 + bx + c) = 2ax + b$$

Example 4

$$\frac{d}{dx}(7x^4 + 2x^3 - 3x + 37) = 28x^3 + 6x^2 - 3 + 0 = 28x^3 + 6x^2 - 3$$


Note that in Examples 3 and 4 the constants c and 37 do not really produce any effect on
the derivative, because the derivative of a constant term is zero. In contrast to the
multiplicative constant, which is retained during differentiation, the additive constant drops
out. This fact provides the mathematical explanation of the well-known economic principle
that the fixed cost of a firm does not affect its marginal cost. Given a short-run total-cost
function

$$C = Q^3 - 4Q^2 + 10Q + 75$$

the marginal-cost function (for infinitesimal output change) is the limit of the quotient
ΔC/ΔQ, or the derivative of the C function:

$$\frac{dC}{dQ} = 3Q^2 - 8Q + 10$$

whereas the fixed cost is represented by the additive constant 75. Since the latter drops out
during the process of deriving dC/dQ, the magnitude of the fixed cost obviously cannot
affect the marginal cost.
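The point about fixed cost can be checked mechanically by differentiating the total-cost polynomial term by term. This is a minimal sketch: storing a polynomial as a list of coefficients (index k holds the coefficient of $Q^k$), the helper name `differentiate_polynomial`, and the alternative fixed cost of 500 are all illustrative assumptions.

```python
def differentiate_polynomial(coeffs):
    """coeffs[k] is the coefficient of Q^k; return the derivative's coefficients.

    Each term c*Q^k becomes k*c*Q^(k-1); the constant term (k = 0) drops out.
    """
    return [k * c for k, c in enumerate(coeffs)][1:]


total_cost = [75, 10, -4, 1]           # C = Q^3 - 4Q^2 + 10Q + 75
marginal_cost = differentiate_polynomial(total_cost)
print(marginal_cost)                   # [10, -8, 3], i.e. 3Q^2 - 8Q + 10

# Changing only the fixed cost (hypothetical value 500) leaves marginal cost unchanged.
higher_fixed_cost = [500, 10, -4, 1]
print(differentiate_polynomial(higher_fixed_cost) == marginal_cost)   # True
```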

In general, if a primitive function y = f(x) represents a total function, then the
derivative function dy/dx is its marginal function. Both functions can, of course, be plotted
against the variable x graphically; and because of the correspondence between the
derivative of a function and the slope of its curve, for each value of x the marginal function should
show the slope of the total function at that value of x. In Fig. 7.1a, a linear (constant-slope)
total function is seen to have a constant marginal function. On the other hand, the
nonlinear (varying-slope) total function in Fig. 7.1b gives rise to a curved marginal function,
which lies below (above) the horizontal axis when the total function is negatively
(positively) sloped. And, finally, the reader may note from Fig. 7.1c (cf. Fig. 6.5) that
“nonsmoothness” of a total function will result in a gap (discontinuity) in the marginal or
derivative function. This is in sharp contrast to the everywhere-smooth total function in
Fig. 7.1b, which gives rise to a continuous marginal function. For this reason, the
smoothness of a primitive function can be linked to the continuity of its derivative function. In
particular, instead of saying that a certain function is smooth (and differentiable) everywhere,
we may alternatively characterize it as a function with a continuous derivative function, and
refer to it as a continuously differentiable function.

FIGURE 7.1 (panels a, b, and c, each showing a total function and its marginal function)

The following notations are often used to denote the continuity and the continuous
differentiability of a function f:

$$f \in C^{(0)} \ \text{or} \ f \in C: \quad f \text{ is continuous}$$

$$f \in C^{(1)} \ \text{or} \ f \in C': \quad f \text{ is continuously differentiable}$$

where $C^{(0)}$, or simply C, is the symbol for the set of all continuous functions, and $C^{(1)}$, or
C', is the symbol for the set of all continuously differentiable functions.


Product Rule

The derivative of the product of two (differentiable) functions is equal to the first function
times the derivative of the second function plus the second function times the derivative
of the first function:

$$\frac{d}{dx}[f(x)g(x)] = f(x)\frac{d}{dx}g(x) + g(x)\frac{d}{dx}f(x) = f(x)g'(x) + g(x)f'(x) \tag{7.4}$$

It is also possible, of course, to rearrange the terms and express the rule as

$$\frac{d}{dx}[f(x)g(x)] = f'(x)g(x) + f(x)g'(x) \tag{7.4'}$$

Example 5

Find the derivative of $y = (2x + 3)(3x^2)$. Let $f(x) = 2x + 3$ and $g(x) = 3x^2$. Then it follows
that $f'(x) = 2$ and $g'(x) = 6x$, and according to (7.4) the desired derivative is

$$\frac{d}{dx}[(2x + 3)(3x^2)] = (2x + 3)(6x) + (3x^2)(2) = 18x^2 + 18x$$

This result can be checked by first multiplying out f(x)g(x) and then taking the
derivative of the product polynomial. The product polynomial is in this case $f(x)g(x) =
(2x + 3)(3x^2) = 6x^3 + 9x^2$, and direct differentiation does yield the same derivative,
$18x^2 + 18x$.
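The check described in this example (product rule versus multiplying out first) can be reproduced with a few lines of code. This is an illustrative sketch only: polynomials are stored as coefficient lists (index k holds the coefficient of $x^k$), and the helper names `poly_mul`, `poly_add`, and `poly_diff` are assumptions, not notation from the text.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_add(p, q):
    """Add two polynomials, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def poly_diff(p):
    """Differentiate term by term using the power-function rule."""
    return [k * c for k, c in enumerate(p)][1:]


f = [3, 2]        # f(x) = 2x + 3
g = [0, 0, 3]     # g(x) = 3x^2

# Product rule (7.4): f(x) g'(x) + g(x) f'(x)
by_rule = poly_add(poly_mul(f, poly_diff(g)), poly_mul(g, poly_diff(f)))
# Alternative procedure: multiply out first, then differentiate.
direct = poly_diff(poly_mul(f, g))

print(by_rule, direct)   # [0, 18, 18] [0, 18, 18], i.e. 18x^2 + 18x
```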


The important point to remember is that the derivative of a product of two functions is
not the simple product of the two separate derivatives. Instead, it is a weighted sum of f'(x)
and g'(x), the weights being g(x) and f(x), respectively. Since this differs from what
intuitive generalization leads one to expect, let us produce a proof for (7.4). According to
(6.14'), the value of the derivative of f(x)g(x) when x = N should be

$$\left.\frac{d}{dx}[f(x)g(x)]\right|_{x=N} = \lim_{x \to N} \frac{f(x)g(x) - f(N)g(N)}{x - N} \tag{7.5}$$


But, by adding and subtracting f(x)g(N) in the numerator (thereby leaving the original
magnitude unchanged), we can transform the quotient on the right of (7.5) as follows:

$$\frac{f(x)g(x) - f(x)g(N) + f(x)g(N) - f(N)g(N)}{x - N} = f(x)\,\frac{g(x) - g(N)}{x - N} + g(N)\,\frac{f(x) - f(N)}{x - N}$$

Substituting this for the quotient on the right of (7.5) and taking its limit, we then get

$$\left.\frac{d}{dx}[f(x)g(x)]\right|_{x=N} = \lim_{x \to N} f(x) \cdot \lim_{x \to N} \frac{g(x) - g(N)}{x - N} + \lim_{x \to N} g(N) \cdot \lim_{x \to N} \frac{f(x) - f(N)}{x - N} \tag{7.5'}$$


The four limit expressions in (7.5') are easily evaluated. The first one is f(N) (since f, being
differentiable, is continuous at N), and the third is g(N) (limit of a constant). The remaining
two are, according to (6.14'), respectively, g'(N) and f'(N). Thus (7.5') reduces to

$$\left.\frac{d}{dx}[f(x)g(x)]\right|_{x=N} = f(N)g'(N) + g(N)f'(N) \tag{7.5''}$$

And, since N represents any value of x, (7.5'') remains valid if we replace every N symbol
by x. This proves the rule.
As an extension of the rule to the case of three functions, we have

$$\frac{d}{dx}[f(x)g(x)h(x)] = f'(x)g(x)h(x) + f(x)g'(x)h(x) + f(x)g(x)h'(x) \quad \text{[cf. (7.4')]} \tag{7.6}$$

In words, the derivative of the product of three functions is equal to the product of the
second and third functions times the derivative of the first, plus the product of the first and third
functions times the derivative of the second, plus the product of the first and second
functions times the derivative of the third. This result can be derived by the repeated application
of (7.4). First treat the product g(x)h(x) as a single function, say, φ(x), so that the original
product of three functions will become a product of two functions, f(x)φ(x). To this, (7.4)
is applicable. After the derivative of f(x)φ(x) is obtained, we may reapply (7.4) to the
product g(x)h(x) ≡ φ(x) to get φ'(x). Then (7.6) will follow. The details are left to you as
an exercise.
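One possible way of filling in those details is sketched below; it simply follows the two steps just described, with φ(x) standing for g(x)h(x) as in the text, and is offered as a check on your own working rather than as the only route.

```latex
\begin{align*}
\frac{d}{dx}\,[f(x)\phi(x)] &= f'(x)\phi(x) + f(x)\phi'(x)
    && \text{[by (7.4'), with } \phi(x) \equiv g(x)h(x)\text{]} \\
\phi'(x) &= g'(x)h(x) + g(x)h'(x)
    && \text{[by (7.4') applied to } g(x)h(x)\text{]} \\
\frac{d}{dx}\,[f(x)g(x)h(x)] &= f'(x)g(x)h(x) + f(x)\,[\,g'(x)h(x) + g(x)h'(x)\,] \\
    &= f'(x)g(x)h(x) + f(x)g'(x)h(x) + f(x)g(x)h'(x)
\end{align*}
```

The last line is (7.6).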

The validity of a rule is one thing; its serviceability is something else. Why do we need
the product rule when we can resort to the alternative procedure of multiplying out the two
functions f(x) and g(x) and then taking the derivative of the product directly? One answer
to this question is that the alternative procedure is applicable only to specific (numerical or
parametric) functions, whereas the product rule is applicable even when the functions are
given in the general form. Let us illustrate with an economic example.


Finding Marginal-Revenue Function from 
Average-Revenue Function 
If we are given an average-revenue (AR) function in specific form,

$$AR = 15 - Q$$

the marginal-revenue (MR) function can be found by first multiplying AR by Q to get the
total-revenue (R) function:

$$R \equiv AR \cdot Q = (15 - Q)Q = 15Q - Q^2$$

and then differentiating R:

$$MR \equiv \frac{dR}{dQ} = 15 - 2Q$$

But if the AR function is given in the general form AR = f(Q), then the total-revenue
function will also be in a general form:

$$R \equiv AR \cdot Q = f(Q) \cdot Q$$


FIGURE 7.2 




and therefore the “multiply out” approach will be to no avail. However, because R is a
product of two functions of Q, namely, f(Q) and Q itself, the product rule can be put to work.
Thus we can differentiate R to get the MR function as follows:

$$MR \equiv \frac{dR}{dQ} = f(Q) \cdot 1 + Q \cdot f'(Q) = f(Q) + Qf'(Q) \tag{7.7}$$

However, can such a general result tell us anything significant about the MR? Indeed it
can. Recalling that f(Q) denotes the AR function, let us rearrange (7.7) and write

$$MR - AR = MR - f(Q) = Qf'(Q) \tag{7.7'}$$

This gives us an important relationship between MR and AR: namely, they will always
differ by the amount Qf'(Q).

It remains to examine the expression Qf'(Q). Its first component Q denotes output and
is always nonnegative. The other component, f'(Q), represents the slope of the AR curve
plotted against Q. Since “average revenue” and “price” are but different names for the same
thing:

$$AR \equiv \frac{R}{Q} \equiv \frac{PQ}{Q} \equiv P$$

the AR curve can also be regarded as a curve relating price P to output Q: P = f(Q).
Viewed in this light, the AR curve is simply the inverse of the demand curve for the
product of the firm, i.e., the demand curve plotted after the P and Q axes are reversed. Under
pure competition, the AR curve is a horizontal straight line, so that f'(Q) = 0 and, from
(7.7'), MR − AR = 0 for all possible values of Q. Thus the MR curve and the AR curve
must coincide. Under imperfect competition, on the other hand, the AR curve is normally
downward-sloping, as in Fig. 7.2, so that f'(Q) < 0 and, from (7.7'), MR − AR < 0 for all
positive levels of output. In this case, the MR curve must lie below the AR curve.
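For the specific function AR = 15 − Q used earlier, the relationship in (7.7') can be verified output level by output level. The function names below are illustrative assumptions, and the range of Q values is arbitrary.

```python
def average_revenue(Q):
    """AR = f(Q) = 15 - Q, the specific example used in the text."""
    return 15 - Q


def ar_slope(Q):
    """f'(Q): the slope of the AR curve, constant at -1 for this linear AR."""
    return -1


def marginal_revenue(Q):
    """MR = dR/dQ = 15 - 2Q, obtained from R = 15Q - Q^2."""
    return 15 - 2 * Q


for Q in range(0, 8):
    gap = marginal_revenue(Q) - average_revenue(Q)
    # (7.7'): MR - AR equals Q f'(Q), which is negative for every positive Q.
    assert gap == Q * ar_slope(Q)
    print(Q, average_revenue(Q), marginal_revenue(Q), gap)
```

With a downward-sloping AR curve the gap is −Q here, so the MR curve lies below the AR curve by an amount that grows with output.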

The conclusion just stated is qualitative in nature; it concerns only the relative positions
of the two curves. But (7.7') also furnishes the quantitative information that the MR curve
will fall short of the AR curve at any output level Q by precisely the amount Qf'(Q). Let
us look at Fig. 7.2 again and consider the particular output level N. For that output, the


AR=P 
i 





AR = P= fiQ) 








158 Part Three Comparative-Stadic Analysis 


Example 6 
Example 7 


Example 8 


expression Q/"(Q) specifically becomes Nf’(N): if we can find the magnitude of NPN) 
in the diagram, we shall know how far below the average-revenue point G the correspond- 
ing marginal-revenue point must lie. 

The magnitude of N is already specified. And f'(N) is simply the slope of the AR curve
at point G (where Q = N), that is, the slope of the tangent line JM measured by the ratio
of two distances OJ/OM. However, we see that OJ/OM = HJ/HG; besides, distance HG is
precisely the amount of output under consideration, N. Thus the distance Nf'(N), by which
the MR curve must lie below the AR curve at output N, is

$$N f'(N) = HG \cdot \frac{HJ}{HG} = HJ$$


Accordingly, if we mark a vertical distance KG = HJ directly below point G, then point K
must be a point on the MR curve. (A simple way of accurately plotting KG is to draw a
straight line passing through point H and parallel to JG; point K is where that line intersects
the vertical line NG.)

The same procedure can be used to locate other points on the MR curve. All we must do,
for any chosen point G' on the curve, is first to draw a tangent to the AR curve at G' that
will meet the vertical axis at some point J'. Then draw a horizontal line from G' to the
vertical axis, and label the intersection with the axis as H'. If we mark a vertical distance
K'G' = H'J' directly below point G', then the point K' will be a point on the MR curve.
This is the graphical way of deriving an MR curve from a given AR curve. Strictly speaking,
the accurate drawing of a tangent line requires a knowledge of the value of the derivative
at the relevant output, that is, f'(N); hence the graphical method just outlined cannot
quite exist by itself. An important exception is the case of a linear AR curve, where the
tangent to any point on the curve is simply the given line itself, so that there is in effect no need
to draw any tangent at all. Then the graphical method will apply in a straightforward way.


Quotient Rule 
The derivative of the quotient of two functions, f(x)/g(x), is 


$$\frac{d}{dx}\,\frac{f(x)}{g(x)} = \frac{f'(x)g(x) - f(x)g'(x)}{g^2(x)}$$

In the numerator of the right-hand expression, we find two product terms, each involving
the derivative of only one of the two original functions. Note that f'(x) appears in the
positive term, and g'(x) in the negative term. The denominator consists of the square of the
function g(x); that is, $g^2(x) \equiv [g(x)]^2$.


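Example 6

$$\frac{d}{dx}\left(\frac{2x - 3}{x + 1}\right) = \frac{2(x + 1) - (2x - 3)(1)}{(x + 1)^2} = \frac{5}{(x + 1)^2}$$

Example 7

$$\frac{d}{dx}\left(\frac{5x}{x^2 + 1}\right) = \frac{5(x^2 + 1) - 5x(2x)}{(x^2 + 1)^2} = \frac{5(1 - x^2)}{(x^2 + 1)^2}$$

Example 8

$$\frac{d}{dx}\left(\frac{ax^2 + b}{cx}\right) = \frac{2ax(cx) - (ax^2 + b)(c)}{(cx)^2} = \frac{c(ax^2 - b)}{(cx)^2} = \frac{ax^2 - b}{cx^2}$$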
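
The quotient rule can also be checked numerically. The sketch below applies it to the function 5x/(x^2 + 1) and compares the result with a finite-difference estimate and with the closed form 5(1 - x^2)/(x^2 + 1)^2. The helper names are hypothetical, chosen only for this illustration.

```python
def f(x):
    """Numerator function f(x) = 5x."""
    return 5.0 * x


def f_prime(x):
    return 5.0


def g(x):
    """Denominator function g(x) = x^2 + 1 (never zero)."""
    return x ** 2 + 1.0


def g_prime(x):
    return 2.0 * x


def quotient_rule(x):
    """[f'(x) g(x) - f(x) g'(x)] / g(x)^2."""
    return (f_prime(x) * g(x) - f(x) * g_prime(x)) / g(x) ** 2


def closed_form(x):
    """Simplified derivative 5(1 - x^2) / (x^2 + 1)^2."""
    return 5.0 * (1.0 - x ** 2) / (x ** 2 + 1.0) ** 2


def central_difference(func, x, h=1e-6):
    """Numerical derivative used only as an independent check."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


for x in (0.0, 0.5, 2.0):
    numeric = central_difference(lambda t: f(t) / g(t), x)
    print(x, quotient_rule(x), closed_form(x), round(numeric, 6))
```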




This rule can be proved as follows. For any value of x = N, we have

$$\left.\frac{d}{dx}\,\frac{f(x)}{g(x)}\right|_{x=N} = \lim_{x \to N} \frac{f(x)/g(x) - f(N)/g(N)}{x - N} \qquad (7.8)$$

The quotient expression following the limit sign can be rewritten in the form

$$\frac{f(x)g(N) - f(N)g(x)}{g(x)g(N)} \cdot \frac{1}{x - N}$$

By adding and subtracting f(N)g(N) in the numerator and rearranging, we can further
transform the expression to

$$\frac{1}{g(x)g(N)} \left[\frac{f(x)g(N) - f(N)g(N) + f(N)g(N) - f(N)g(x)}{x - N}\right]$$

$$= \frac{1}{g(x)g(N)} \left[g(N)\,\frac{f(x) - f(N)}{x - N} - f(N)\,\frac{g(x) - g(N)}{x - N}\right]$$

Substituting this result into (7.8) and taking the limit, we then have

$$\left.\frac{d}{dx}\,\frac{f(x)}{g(x)}\right|_{x=N} = \lim_{x \to N} \frac{1}{g(x)g(N)} \left[\lim_{x \to N} g(N) \lim_{x \to N} \frac{f(x) - f(N)}{x - N} - \lim_{x \to N} f(N) \lim_{x \to N} \frac{g(x) - g(N)}{x - N}\right]$$

$$= \frac{1}{g^2(N)} \left[g(N) f'(N) - f(N) g'(N)\right] \qquad \text{[by (6.13)]}$$

which can be generalized by replacing the symbol N with x, because N represents any value
of x. This proves the quotient rule.


Relationship Between Marginal-Cost and 
Average-Cost Functions 
As an economic application of the quotient rule, let us consider the rate of change of
average cost when output varies.

Given a total-cost function C = C(Q), the average-cost (AC) function is a quotient of
two functions of Q, since AC = C(Q)/Q, defined as long as Q > 0. Therefore, the rate of
change of AC with respect to Q can be found by differentiating AC:

$$\frac{d}{dQ}\,\frac{C(Q)}{Q} = \frac{C'(Q) \cdot Q - C(Q) \cdot 1}{Q^2} = \frac{1}{Q}\left[C'(Q) - \frac{C(Q)}{Q}\right] \qquad (7.9)$$

From this it follows that, for Q > 0,

$$\frac{d}{dQ}\,\frac{C(Q)}{Q} \gtreqless 0 \quad \text{iff} \quad C'(Q) \gtreqless \frac{C(Q)}{Q} \qquad (7.10)$$

Since the derivative C'(Q) represents the marginal-cost (MC) function, and C(Q)/Q
represents the AC function, the economic meaning of (7.10) is: The slope of the AC
curve will be positive, zero, or negative if and only if the marginal-cost curve lies above,
intersects, or lies below the AC curve. This is illustrated in Fig. 7.3, where the MC and AC
functions plotted are based on the specific total-cost function

$$C = Q^3 - 12Q^2 + 60Q$$

To the left of Q = 6, AC is declining, and thus MC lies below it; to the right, the opposite
is true. At Q = 6, AC has a slope of zero, and MC and AC have the same value.†

[FIGURE 7.3: the curves $MC = 3Q^2 - 24Q + 60$ and $AC = Q^2 - 12Q + 60$, with dollars measured on the vertical axis]

The qualitative conclusion in (7.10) is stated explicitly in terms of cost functions. How- 
ever, its validity remains unaffected if we interpret C(Q) as any other differentiable total 
function, with C(Q)/Q and C'(Q) as its corresponding average and marginal functions.
Thus this result gives us a general marginal-average relationship. In particular, we may 
point out, the fact that MR lies below AR when AR is downward-sloping, as discussed in 
connection with Fig. 7.2, is nothing but a special case of the general result in (7.10). 
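
The marginal-average relationship can be illustrated with a few lines of Python. The sketch below uses the cost function behind Fig. 7.3, whose MC and AC curves are 3Q^2 - 24Q + 60 and Q^2 - 12Q + 60, and evaluates the AC slope as (1/Q)(MC - AC), as in (7.9). The function names and the sample output levels are our own choices.

```python
def total_cost(q):
    """C = Q^3 - 12Q^2 + 60Q, the total-cost function behind Fig. 7.3."""
    return q ** 3 - 12.0 * q ** 2 + 60.0 * q


def marginal_cost(q):
    """MC = C'(Q) = 3Q^2 - 24Q + 60."""
    return 3.0 * q ** 2 - 24.0 * q + 60.0


def average_cost(q):
    """AC = C(Q)/Q, defined only for Q > 0."""
    return total_cost(q) / q


def ac_slope(q):
    """Slope of AC from (7.9): (1/Q) * [MC - AC]."""
    return (marginal_cost(q) - average_cost(q)) / q


# Left of Q = 6 the slope is negative (MC below AC); at Q = 6 it is zero;
# right of Q = 6 it is positive (MC above AC).
for q in (3.0, 5.0, 6.0, 9.0):
    print(q, marginal_cost(q), average_cost(q), ac_slope(q))
```

At Q = 5 the sketch also shows MC already rising while AC is still falling, the situation described in the footnote to (7.10).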





† Note that (7.10) does not state that, when AC is negatively sloped, MC must also be negatively
sloped; it merely says that AC must exceed MC in that circumstance. At Q = 5 in Fig. 7.3, for
instance, AC is declining but MC is rising, so that their slopes will have opposite signs.





EXERCISE 7.2 
1. Given the total-cost function $C = Q^3 - 5Q^2 + 12Q + 75$, write out a variable-cost
(VC) function. Find the derivative of the VC function, and interpret the economic
meaning of that derivative.


2. Given the average-cost function $AC = Q^2 - 4Q + 174$, find the MC function. Is the
given function more appropriate as a long-run or a short-run function? Why?


3. Differentiate the following by using the product rule:
(a) $(9x^2 - 2)(3x + 1)$    (c) $x^2(4x + 6)$    (e) $(2 - 3x)(1 + x)(x + 2)$
(b) $(3x + 10)(6x^2 - 7x)$    (d) $(ax - b)(cx^2)$    (f) $(x^2 + 3)x^{-1}$
4. (a) Given AR = 60 - 3Q, plot the average-revenue curve, and then find the MR curve
by the method used in Fig. 7.2.
(b) Find the total-revenue function and the marginal-revenue function mathematically
from the given AR function.
(c) Does the graphically derived MR curve in (a) check with the mathematically
derived MR function in (b)?
(d) Comparing the AR and MR functions, what can you conclude about their relative
slopes?

5. Provide a mathematical proof for the general result that, given a linear average curve,
the corresponding marginal curve must have the same vertical intercept but will be
twice as steep as the average curve.

6. Prove the result in (7.6) by first treating g(x)h(x) as a single function, $g(x)h(x) \equiv \phi(x)$,
and then applying the product rule (7.4).

7. Find the derivatives of:
(a) $(x^2 + 3)/x$    (c) $6x/(x + 5)$
(b) $(x + 9)/x$    (d) $(ax^2 + b)/(cx + d)$
8. Given the function f(x) = ax + b, find the derivatives of:
(a) f(x)    (b) xf(x)    (c) 1/f(x)    (d) f(x)/x
9. (a) Is it true that $f \in C' \Rightarrow f \in C$?
(b) Is it true that $f \in C \Rightarrow f \in C'$?
10. Find the marginal and average functions for the following total functions and graph
the results.
Total-cost function:
(a) $C = 3Q^2 + 7Q + 12$
Total-revenue function:
(b) $R = 10Q - Q^2$
Total-product function:
(c) $Q = aL + bL^2 - cL^3$    (a, b, c > 0)

7.3 Rules of Differentiation 
Involving Functions of Different Variables 





In Sec. 7.2, we discussed the rules of differentiation of a sum, difference, product, or
quotient of two (or more) differentiable functions of the same variable. Now we shall consider
cases where there are two or more differentiable functions, each of which has a distinct
independent variable.


Chain Rule 
If we have a differentiable function z = f(y), where y is in turn a differentiable function of
another variable x, say, y = g(x), then the derivative of z with respect to x is equal to the
derivative of z with respect to y, times the derivative of y with respect to x. Expressed
symbolically,

$$\frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx} = f'(y)\,g'(x) \qquad (7.11)$$

This rule, known as the chain rule, appeals easily to intuition. Given a Δx, there must result
a corresponding Δy via the function y = g(x), but this Δy will in turn bring about a Δz
via the function z = f(y). Thus there is a “chain reaction” as follows:

$$\Delta x \xrightarrow{\text{via } g} \Delta y \xrightarrow{\text{via } f} \Delta z$$

The two links in this chain entail two difference quotients, Δy/Δx and Δz/Δy, but when
they are multiplied, the Δy will cancel itself out, and we end up with

$$\frac{\Delta z}{\Delta y}\,\frac{\Delta y}{\Delta x} = \frac{\Delta z}{\Delta x}$$

a difference quotient that relates Δz to Δx. If we take the limit of these difference quotients
as Δx → 0 (which implies Δy → 0), each difference quotient will turn into a derivative;
i.e., we shall have (dz/dy)(dy/dx) = dz/dx. This is precisely the result in (7.11).

In view of the function y = g(x), we can express the function z = f(y) as z = f[g(x)],
where the contiguous appearance of the two function symbols f and g indicates that this is
a composite function (function of a function). It is for this reason that the chain rule is also
referred to as the composite-function rule or function-of-a-function rule.

The extension of the chain rule to three or more functions is straightforward. If we have
z = f(y), y = g(x), and x = h(w), then

$$\frac{dz}{dw} = \frac{dz}{dy}\,\frac{dy}{dx}\,\frac{dx}{dw} = f'(y)\,g'(x)\,h'(w)$$

and similarly for cases in which more functions are involved.





Example 1

If $z = 3y^2$, where y = 2x + 5, then

$$\frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx} = 6y(2) = 12y = 12(2x + 5)$$

Example 2

If z = y - 3, where $y = x^3$, then

$$\frac{dz}{dx} = 1(3x^2) = 3x^2$$

Example 3

The usefulness of this rule can best be appreciated when we must differentiate a function
such as $z = (x^2 + 3x - 2)^{17}$. Without the chain rule at our disposal, dz/dx can be found
only via the laborious route of first multiplying out the 17th-power expression. With the
chain rule, however, we can take a shortcut by defining a new, intermediate variable
$y = x^2 + 3x - 2$, so that we get in effect two functions linked in a chain:

$$z = y^{17} \qquad \text{and} \qquad y = x^2 + 3x - 2$$

The derivative dz/dx can then be found as follows:

$$\frac{dz}{dx} = \frac{dz}{dy}\,\frac{dy}{dx} = 17y^{16}(2x + 3) = 17(x^2 + 3x - 2)^{16}(2x + 3)$$
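
To see the chain rule at work numerically, the following Python sketch evaluates the derivative of the 17th-power function by multiplying the two links, dz/dy and dy/dx, and compares the product with a finite-difference estimate of dz/dx. The function names and the evaluation point x = 1 are assumptions made only for this illustration.

```python
def inner(x):
    """Intermediate variable y = x^2 + 3x - 2."""
    return x ** 2 + 3.0 * x - 2.0


def inner_prime(x):
    """dy/dx = 2x + 3."""
    return 2.0 * x + 3.0


def outer(y):
    """z = y^17."""
    return y ** 17


def outer_prime(y):
    """dz/dy = 17 y^16."""
    return 17.0 * y ** 16


def chain_rule_derivative(x):
    """dz/dx = (dz/dy) * (dy/dx), with dz/dy evaluated at y = inner(x)."""
    return outer_prime(inner(x)) * inner_prime(x)


def central_difference(func, x, h=1e-6):
    """Numerical derivative used only as an independent check."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


x0 = 1.0  # at x = 1 we have y = 2 and dy/dx = 5
exact = chain_rule_derivative(x0)
approx = central_difference(lambda t: outer(inner(t)), x0)
print(exact, approx)
```

At x = 1 the chain-rule value is 17(2^16)(5), and the numerical estimate should agree with it to several significant figures.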


Example 4

Given a total-revenue function of a firm R = f(Q), where output Q is a function of labor
input L, or Q = g(L), find dR/dL. By the chain rule, we have

$$\frac{dR}{dL} = \frac{dR}{dQ}\,\frac{dQ}{dL} = f'(Q)\,g'(L)$$

Translated into economic terms, dR/dQ is the MR function and dQ/dL is the marginal-
physical-product-of-labor ($MPP_L$) function. Similarly, dR/dL has the connotation of the
marginal-revenue-product-of-labor ($MRP_L$) function. Thus the result shown constitutes the
mathematical statement of the well-known result in economics that $MRP_L = MR \cdot MPP_L$.


Inverse-Function Rule 

If the function y = f(x) represents a one-to-one mapping, i.e., if the function is such that
each value of y is associated with a unique value of x, the function f will have an inverse
function $x = f^{-1}(y)$ (read: “x is an inverse function of y”). Here, the symbol $f^{-1}$ is a
function symbol which, like the derivative-function symbol f', signifies a function related to
the function f; it does not mean the reciprocal of the function f(x).

What the existence of an inverse function essentially means is that, in this case, not only
will a given value of x yield a unique value of y [that is, y = f(x)], but also a given value
of y will yield a unique value of x. To take a nonnumerical instance, we may exemplify the
one-to-one mapping by the mapping from the set of all husbands to the set of all wives in a
monogamous society. Each husband has a unique wife, and each wife has a unique husband.
In contrast, the mapping from the set of all fathers to the set of all sons is not one-to-
one, because a father may have more than one son, albeit each son has a unique father.

When x and y refer specifically to numbers, the property of one-to-one mapping is seen
to be unique to the class of functions known as strictly monotonic (or monotone) functions.
Given a function f(x), if successively larger values of the independent variable x always
lead to successively larger values of f(x), that is, if

$$x_1 > x_2 \Rightarrow f(x_1) > f(x_2)$$

then the function f is said to be a strictly increasing function. If successive increases in x
always lead to successive decreases in f(x), that is, if

$$x_1 > x_2 \Rightarrow f(x_1) < f(x_2)$$

on the other hand, the function is said to be a strictly decreasing function. In either of these
cases, an inverse function $f^{-1}$ exists.†

A practical way of ascertaining the strict monotonicity of a given function y = f(x) is
to check whether the derivative f'(x) always adheres to the same algebraic sign (not zero)
for all values of x. Geometrically, this means that its slope is either always upward or always
downward. Thus a firm’s demand curve Q = f(P) that has a negative slope throughout is
strictly decreasing. As such, it has an inverse function $P = f^{-1}(Q)$, which, as mentioned
previously, gives the average-revenue curve of the firm, since P = AR.

† By omitting the adverb strictly, we can define monotonic (or monotone) functions as follows: An
increasing function is a function with the property that

$$x_1 > x_2 \Rightarrow f(x_1) \geq f(x_2) \qquad \text{[with the weak inequality } \geq \text{]}$$

and a decreasing function is one with the property that

$$x_1 > x_2 \Rightarrow f(x_1) \leq f(x_2) \qquad \text{[with the weak inequality } \leq \text{]}$$

Note that, under this definition, an ascending (descending) step function qualifies as an increasing
(decreasing) function, despite the fact that its graph contains horizontal segments. Since such
functions do not have a one-to-one mapping, they do not have inverse functions.


Example 5

The function

$$y = 5x + 25$$

has the derivative dy/dx = 5, which is positive regardless of the value of x; thus the function
is strictly increasing. It follows that an inverse function exists. In the present case, the inverse
function is easily found by solving the given equation y = 5x + 25 for x. The result is the
function

$$x = \tfrac{1}{5}y - 5$$

It is interesting to note that this inverse function is also strictly increasing, because
dx/dy = 1/5 > 0 for all values of y.


Generally speaking, if an inverse function exists, the original and the inverse functions
must both be strictly monotonic. Moreover, if $f^{-1}$ is the inverse function of f, then f must
be the inverse function of $f^{-1}$; that is, f and $f^{-1}$ must be inverse functions of each other.

It is easy to verify that the graph of y = f(x) and that of $x = f^{-1}(y)$ are one and the
same, only with the axes reversed. If one lays the x axis of the $f^{-1}$ graph over the x axis of
the f graph (and similarly for the y axis), the two curves will coincide. On the other hand, if
the x axis of the $f^{-1}$ graph is laid over the y axis of the f graph (and vice versa), the two
curves will become mirror images of each other with reference to the 45° line drawn
through the origin. This mirror-image relationship provides us with an easy way of graphing
the inverse function $f^{-1}$, once the graph of the original function f is given. (You should
try this with the two functions in Example 5.)

For inverse functions, the rule of differentiation is

$$\frac{dx}{dy} = \frac{1}{dy/dx}$$

This means that the derivative of the inverse function is the reciprocal of the derivative of
the original function; as such, dx/dy must take the same sign as dy/dx, so that if f is strictly
increasing (decreasing), then so must be $f^{-1}$.

As a verification of this rule, we can refer back to Example 5, where dy/dx was found to
be 5, and dx/dy equal to 1/5. These two derivatives are indeed reciprocal to each other and
have the same sign.

In that simple example, the inverse function is relatively easy to obtain, so that its 
derivative dx/dy can be found directly from the inverse function. As Example 6 shows, 
however, the inverse function is sometimes difficult to express explicitly, and thus direct 
differentiation may not be practicable. The usefulness of the inverse-function rule then 
becomes more fully apparent. 





Example 6

Given $y = x^5 + x$, find dx/dy. First of all, since

$$\frac{dy}{dx} = 5x^4 + 1 > 0$$

for any value of x, the given function is strictly increasing, and an inverse function exists. To
solve the given equation for x may not be such an easy task, but the derivative of the inverse
function can nevertheless be found quickly by use of the inverse-function rule:

$$\frac{dx}{dy} = \frac{1}{dy/dx} = \frac{1}{5x^4 + 1}$$


The inverse-function rule is, strictly speaking, applicable only when the function involved
is a one-to-one mapping. In fact, however, we do have some leeway. For instance, when
dealing with a U-shaped curve (not strictly monotonic), we may consider the downward-
and the upward-sloping segments of the curve as representing two separate functions, each
with a restricted domain, and each being strictly monotonic in the restricted domain. To
each of these, the inverse-function rule can then again be applied.
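
The inverse-function rule is most useful when the inverse cannot be written down explicitly, as with the fifth-degree function x^5 + x. The Python sketch below inverts that function numerically by bisection (a stand-in technique of our own choosing, which works because the function is strictly increasing) and compares the slope of the numerical inverse with the reciprocal 1/(5x^4 + 1). The bracketing interval and tolerances are assumptions of the sketch.

```python
def f(x):
    """Original function y = x^5 + x, strictly increasing for all x."""
    return x ** 5 + x


def f_prime(x):
    """dy/dx = 5x^4 + 1, always positive."""
    return 5.0 * x ** 4 + 1.0


def inverse(y, lo=-10.0, hi=10.0, tol=1e-13):
    """Find x with f(x) = y by bisection; assumes f(lo) <= y <= f(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def inverse_slope_numeric(y, h=1e-4):
    """Central-difference estimate of dx/dy from the numerical inverse."""
    return (inverse(y + h) - inverse(y - h)) / (2.0 * h)


def inverse_slope_rule(y):
    """Inverse-function rule: dx/dy = 1 / (dy/dx), evaluated at x = inverse(y)."""
    return 1.0 / f_prime(inverse(y))


y0 = 2.0  # f(1) = 2, so the inverse at y = 2 is x = 1 and dx/dy = 1/6
print(inverse(y0), inverse_slope_rule(y0), inverse_slope_numeric(y0))
```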





EXERCISE 7.3 


1. Given $y = u^3 + 2u$, where $u = 5 - x^2$, find dy/dx by the chain rule.

2. Given $w = ay^2$ and $y = bx^2 + cx$, find dw/dx by the chain rule.

3. Use the chain rule to find dy/dx for the following:

(a) $y = (3x^2 - 13)^3$    (b) $y = (7x^3 - 5)^9$    (c) $y = (ax + b)^5$

4. Given y = (16x + 3)~?, use the chain rule to find dy/dx. Then rewrite the function as 
y = 1/(16x + 3)? and find dy/dx by the quotient rule. Are the answers identical? 

5. Given y = 7x + 21, find its inverse function. Then find dy/dx and dx/dy, and verify the
inverse-function rule. Also verify that the graphs of the two functions bear a mirror-
image relationship to each other.

6. Are the following functions strictly monotonic?

(a) $y = -x^6 + 5$    (x > 0)
(b) $y = 4x^5 + x^3 + 3x$
For each strictly monotonic function, find dx/dy by the inverse-function rule.


7.4 Partial Differentiation 





Hitherto, we have considered only the derivatives of functions of a single independent
variable. In comparative-static analysis, however, we are likely to encounter the situation in
which several parameters appear in a model, so that the equilibrium value of each endogenous
variable may be a function of more than one parameter. Therefore, as a final preparation
for the application of the concept of derivative to comparative statics, we must learn
how to find the derivative of a function of more than one variable.


Partial Derivatives 
Let us consider a function

$$y = f(x_1, x_2, \ldots, x_n) \qquad (7.12)$$

where the variables $x_i$ (i = 1, 2, ..., n) are all independent of one another, so that each can
vary by itself without affecting the others. If the variable $x_1$ undergoes a change $\Delta x_1$ while
$x_2, \ldots, x_n$ all remain fixed, there will be a corresponding change in y, namely, Δy. The
difference quotient in this case can be expressed as

$$\frac{\Delta y}{\Delta x_1} = \frac{f(x_1 + \Delta x_1, x_2, \ldots, x_n) - f(x_1, x_2, \ldots, x_n)}{\Delta x_1} \qquad (7.13)$$

If we take the limit of $\Delta y/\Delta x_1$ as $\Delta x_1 \to 0$, that limit will constitute a derivative. We call
it the partial derivative of y with respect to $x_1$, to indicate that all the other independent
variables in the function are held constant when taking this particular derivative. Similar
partial derivatives can be defined for infinitesimal changes in the other independent
variables. The process of taking partial derivatives is called partial differentiation.

Partial derivatives are assigned distinctive symbols. In lieu of the letter d (as in dy/dx),
we employ the symbol ∂, which is a variant of the Greek δ (lowercase delta). Thus we
shall now write $\partial y/\partial x_i$, which is read: “the partial derivative of y with respect to $x_i$.” The
partial-derivative symbol sometimes is also written as $\frac{\partial}{\partial x_i} y$; in that case, its $\partial/\partial x_i$ part can
be regarded as an operator symbol instructing us to take the partial derivative of (some
function) with respect to the variable $x_i$. Since the function involved here is denoted in
(7.12) by f, it is also permissible to write $\partial f/\partial x_i$.

Is there also a partial-derivative counterpart for the symbol f'(x) that we used before?
The answer is yes. Instead of f', however, we now use $f_1$, $f_2$, etc., where the subscript
indicates which independent variable (alone) is being allowed to vary. If the function in (7.12)
happens to be written in terms of unsubscripted variables, such as y = f(u, v, w), then the
partial derivatives may be denoted by $f_u$, $f_v$, and $f_w$ rather than $f_1$, $f_2$, and $f_3$.

In line with these notations, and on the basis of (7.12) and (7.13), we can now define

$$f_1 \equiv \frac{\partial y}{\partial x_1} \equiv \lim_{\Delta x_1 \to 0} \frac{\Delta y}{\Delta x_1}$$

as the first in the set of n partial derivatives of the function f.


Techniques of Partial Differentiation 

Partial differentiation differs from the previously discussed differentiation primarily in that
we must hold (n - 1) independent variables constant while allowing one variable to vary.
Inasmuch as we have learned how to handle constants in differentiation, the actual
differentiation should pose little problem.


Example 1

Given $y = f(x_1, x_2) = 3x_1^2 + x_1x_2 + 4x_2^2$, find the partial derivatives. When finding $\partial y/\partial x_1$
(or $f_1$), we must bear in mind that $x_2$ is to be treated as a constant during differentiation.
As such, $x_2$ will drop out in the process if it is an additive constant (such as the term $4x_2^2$) but
will be retained if it is a multiplicative constant (such as in the term $x_1x_2$). Thus we have

$$\frac{\partial y}{\partial x_1} \equiv f_1 = 6x_1 + x_2$$

Similarly, by treating $x_1$ as a constant, we find that

$$\frac{\partial y}{\partial x_2} \equiv f_2 = x_1 + 8x_2$$

Note that, like the primitive function f, both partial derivatives are themselves functions
of the variables $x_1$ and $x_2$. That is, we may write them as two derived functions

$$f_1 = f_1(x_1, x_2) \qquad \text{and} \qquad f_2 = f_2(x_1, x_2)$$

For the point $(x_1, x_2) = (1, 3)$ in the domain of the function f, for example, the partial
derivatives will take the following specific values:

$$f_1(1, 3) = 6(1) + 3 = 9 \qquad \text{and} \qquad f_2(1, 3) = 1 + 8(3) = 25$$


Example 2

Given y = f(u, v) = (u + 4)(3u + 2v), the partial derivatives can be found by use of the
product rule. By holding v constant, we have

$$f_u = (u + 4)(3) + 1(3u + 2v) = 2(3u + v + 6)$$

Similarly, by holding u constant, we find that

$$f_v = (u + 4)(2) + 0(3u + 2v) = 2(u + 4)$$

When u = 2 and v = 1, these derivatives will take the following values:

$$f_u(2, 1) = 2(13) = 26 \qquad \text{and} \qquad f_v(2, 1) = 2(6) = 12$$

Example 3

Given $y = (3u - 2v)/(u^2 + 3v)$, the partial derivatives can be found by use of the quotient
rule:

$$\frac{\partial y}{\partial u} = \frac{3(u^2 + 3v) - 2u(3u - 2v)}{(u^2 + 3v)^2} = \frac{-3u^2 + 4uv + 9v}{(u^2 + 3v)^2}$$

$$\frac{\partial y}{\partial v} = \frac{-2(u^2 + 3v) - 3(3u - 2v)}{(u^2 + 3v)^2} = \frac{-u(2u + 9)}{(u^2 + 3v)^2}$$
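
The idea of holding all other variables constant translates directly into a finite-difference computation. The sketch below defines a hypothetical helper, `partial`, that perturbs only one argument, and applies it to the product-form function (u + 4)(3u + 2v) and to the quotient-form function (3u - 2v)/(u^2 + 3v) at the point u = 2, v = 1. The helper and the choice of step size are our own assumptions.

```python
def partial(func, point, index, h=1e-6):
    """Central-difference partial derivative of func at point.

    Only the argument at position index is varied; all others stay fixed.
    """
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2.0 * h)


def product_form(u, v):
    """y = (u + 4)(3u + 2v)."""
    return (u + 4.0) * (3.0 * u + 2.0 * v)


def quotient_form(u, v):
    """y = (3u - 2v) / (u^2 + 3v)."""
    return (3.0 * u - 2.0 * v) / (u ** 2 + 3.0 * v)


point = (2.0, 1.0)

# Product form: analytic values are 2(3u + v + 6) = 26 and 2(u + 4) = 12.
print(partial(product_form, point, 0), partial(product_form, point, 1))

# Quotient form: analytic values are (-3u^2 + 4uv + 9v)/(u^2 + 3v)^2 = 5/49
# and -u(2u + 9)/(u^2 + 3v)^2 = -26/49.
print(partial(quotient_form, point, 0), partial(quotient_form, point, 1))
```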





Geometric Interpretation of Partial Derivatives 

As a special type of derivative, a partial derivative is a measure of the instantaneous rates
of change of some variable, and in that capacity it again has a geometric counterpart in the
slope of a particular curve.

Let us consider a production function Q = Q(K, L), where Q, K, and L denote output,
capital input, and labor input, respectively. This function is a particular two-variable
version of (7.12), with n = 2. We can therefore define two partial derivatives $\partial Q/\partial K$ (or $Q_K$)
and $\partial Q/\partial L$ (or $Q_L$). The partial derivative $Q_K$ relates to the rates of change of output with
respect to infinitesimal changes in capital, while labor input is held constant. Thus $Q_K$
symbolizes the marginal-physical-product-of-capital ($MPP_K$) function. Similarly, the partial
derivative $Q_L$ is the mathematical representation of the $MPP_L$ function.

Geometrically, the production function Q = Q(K, L) can be depicted by a production
surface in a 3-space, such as is shown in Fig. 7.4. The variable Q is plotted vertically, so
that for any point (K, L) in the base plane (KL plane), the height of the surface will indicate
the output Q. The domain of the function should consist of the entire nonnegative
quadrant of the base plane, but for our purposes it is sufficient to consider a subset of it, the
rectangle $OK_0BL_0$. As a consequence, only a small portion of the production surface is
shown in the figure.

[FIGURE 7.4: the production surface Q = Q(K, L) drawn over the rectangle $OK_0BL_0$ of the KL plane, with Q on the vertical axis]

Let us now hold capital fixed at the level $K_0$ and consider only variations in the input L.
By setting $K = K_0$, all points in our (curtailed) domain become irrelevant except those on
the line segment $K_0B$. By the same token, only the curve $K_0CDA$ (a cross section of the
production surface) is germane to the present discussion. This curve represents a total-
physical-product-of-labor ($TPP_L$) curve for a fixed amount of capital $K = K_0$; thus we
may read from its slope the rate of change of Q with respect to changes in L while K is held
constant. It is clear, therefore, that the slope of a curve such as $K_0CDA$ represents the
geometric counterpart of the partial derivative $Q_L$. Once again, we note that the slope of a total
($TPP_L$) curve is its corresponding marginal ($MPP_L \equiv Q_L$) curve.

As mentioned earlier, a partial derivative is a function of all the independent variables of
the primitive function. That $Q_L$ is a function of L is immediately obvious from the $K_0CDA$
curve itself. When $L = L_1$, the value of $Q_L$ is equal to the slope of the curve at point C; but
when $L = L_2$, the relevant slope is the one at point D. Why is $Q_L$ also a function of K? The
answer is that K can be fixed at various levels, and for each fixed level of K, there results a
different $TPP_L$ curve (a different cross section of the production surface), with inevitable
repercussions on the derivative $Q_L$. Hence $Q_L$ is also a function of K.

An analogous interpretation can be given to the partial derivative $Q_K$. If the labor input
is held constant instead of K (say, at the level of $L_0$), the line segment $L_0B$ will be the
relevant subset of the domain, and the curve $L_0A$ will indicate the relevant subset of the
production surface. The partial derivative $Q_K$ can then be interpreted as the slope of the curve
$L_0A$, bearing in mind that the K axis extends from southeast to northwest in Fig. 7.4. It
should be noted that $Q_K$ is again a function of both the variables L and K.


Gradient Vector 
All the partial derivatives of a function $y = f(x_1, x_2, \ldots, x_n)$ can be collected under a
single mathematical entity called the gradient vector, or simply the gradient, of function f:

$$\text{grad } f(x_1, x_2, \ldots, x_n) = (f_1, f_2, \ldots, f_n)$$

where $f_i \equiv \partial y/\partial x_i$. Note that we are using parentheses rather than brackets here in writing
the vector. Alternatively, the gradient can be denoted by $\nabla f(x_1, x_2, \ldots, x_n)$, where ∇
(read: “del”) is the inverted version of the Greek letter Δ.

Since the function f has n arguments, there are altogether n partial derivatives; hence,
grad f is an n-vector. When these derivatives are evaluated at a specific point
$(x_{10}, x_{20}, \ldots, x_{n0})$ in the domain, we get grad $f(x_{10}, x_{20}, \ldots, x_{n0})$, a vector of specific
derivative values.








Example 4

The gradient vector of the production function Q = Q(K, L) is

$$\nabla Q = \nabla Q(K, L) = (Q_K, Q_L)$$
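
A gradient evaluated at a specific point is simply the tuple of partial derivatives at that point. As a minimal sketch, the Python function below (a hypothetical helper named `gradient`, not part of the text) approximates each partial by a central difference and is applied to the two-variable quadratic of Example 1 in the partial-differentiation discussion, at the point (1, 3).

```python
def gradient(func, point, h=1e-6):
    """Approximate grad f at point as a tuple of central-difference partials."""
    partials = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += h      # vary only the i-th argument
        down[i] -= h
        partials.append((func(*up) - func(*down)) / (2.0 * h))
    return tuple(partials)


def quadratic(x1, x2):
    """y = 3*x1^2 + x1*x2 + 4*x2^2, with f1 = 6*x1 + x2 and f2 = x1 + 8*x2."""
    return 3.0 * x1 ** 2 + x1 * x2 + 4.0 * x2 ** 2


grad_at_point = gradient(quadratic, (1.0, 3.0))
print(grad_at_point)  # analytic gradient at (1, 3) is (9, 25)
```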
EXERCISE 7.4 
1. Find $\partial y/\partial x_1$ and $\partial y/\partial x_2$ for each of the following functions:
(a) $y = 2x_1^3 - 11x_1^2x_2 + 3x_2^2$    (c) $y = (2x_1 + 3)(x_2 - 2)$
(b) $y = 7x_1 + 6x_1x_2^2 - 9x_2^3$    (d) $y = (5x_1 + 3)/(x_2 - 2)$
2. Find $f_x$ and $f_y$ from the following:
(a) $f(x, y) = x^2 + 5xy - y^3$    (c) $f(x, y) = \dfrac{2x - 3y}{x + y}$
(b) $f(x, y) = (x^2 - 3y)(x - 2)$    (d) $f(x, y) = \dfrac{x^2 - 1}{xy}$


3. From the answers to Prob. 2, find $f_x(1, 2)$, the value of the partial derivative $f_x$ when
x = 1 and y = 2, for each function.

4. Given the production function $Q = 96K^{0.3}L^{0.7}$, find the $MPP_K$ and $MPP_L$ functions. Is
$MPP_K$ a function of K alone, or of both K and L? What about $MPP_L$?

5. If the utility function of an individual takes the form

$$U = U(x_1, x_2) = (x_1 + 2)^2(x_2 + 3)^3$$

where U is total utility, and $x_1$ and $x_2$ are the quantities of two commodities consumed:

(a) Find the marginal-utility function of each of the two commodities.

(b) Find the value of the marginal utility of the first commodity when 3 units of each
commodity are consumed.

6. The total money supply M has two components: bank deposits D and cash holdings C,
which we assume to bear a constant ratio C/D = c, 0 < c < 1. The high-powered
money H is defined as the sum of cash holdings held by the public and the reserves
held by the banks. Bank reserves are a fraction of bank deposits, determined by the
reserve ratio r, 0 < r < 1.

(a) Express the money supply M as a function of high-powered money H.

(b) Would an increase in the reserve ratio r raise or lower the money supply?

(c) How would an increase in the cash-deposit ratio c affect the money supply?
7. Write the gradients of the following functions:

(a) $f(x, y, z) = x^2 + y^3 + z^4$

(b) $f(x, y, z) = xyz$


7.5 Applications to Comparative-Static Analysis 





Equipped with the knowledge of the various rules of differentiation, we can at last tackle 
the problem posed in comparative-static analysis: namely, how the equilibrium value of an 
endogenous variable will change when there is a change in any of the exogenous variables 
or parameters. 


Market Model 


First let us consider again the simple one-commodity market model of (3.1). That model
can be written in the form of two equations:

$$Q = a - bP \qquad (a, b > 0) \qquad \text{[demand]}$$
$$Q = -c + dP \qquad (c, d > 0) \qquad \text{[supply]}$$

with solutions

$$P^* = \frac{a + c}{b + d} \qquad (7.14)$$

$$Q^* = \frac{ad - bc}{b + d} \qquad (7.15)$$

These solutions will be referred to as being in the reduced form: The two endogenous
variables have been reduced to explicit expressions of the four mutually independent
parameters a, b, c, and d.
To find how an infinitesimal change in one of the parameters will affect the value of P*,
one has only to differentiate (7.14) partially with respect to each of the parameters. If the
sign of a partial derivative, say, $\partial P^*/\partial a$, can be determined from the given information
about the parameters, we shall know the direction in which P* will move when the
parameter a changes; this constitutes a qualitative conclusion. If the magnitude of $\partial P^*/\partial a$ can be
ascertained, it will constitute a quantitative conclusion.
Similarly, we can draw qualitative or quantitative conclusions from the partial derivatives
of Q* with respect to each parameter, such as $\partial Q^*/\partial a$. To avoid misunderstanding,
however, a clear distinction should be made between the two derivatives $\partial Q^*/\partial a$ and
$\partial Q/\partial a$. The latter derivative is a concept appropriate to the demand function taken alone,
and without regard to the supply function. The derivative $\partial Q^*/\partial a$ pertains, on the other
hand, to the equilibrium quantity in (7.15) which, being in the nature of a solution of the
model, takes into account the interaction of demand and supply together. To emphasize this
distinction, we shall refer to the partial derivatives of P* and Q* with respect to the
parameters as comparative-static derivatives. The possibility of confusion between $\partial Q^*/\partial a$ and
$\partial Q/\partial a$ is precisely the reason why we have chosen to use the asterisk notation, as in Q*, to
denote the equilibrium value.
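Concentrating on P* for the time being, we can get the following four partial derivatives
from (7.14):

$$\frac{\partial P^*}{\partial a} = \frac{1}{b + d} \qquad \left[\text{parameter } a \text{ has the coefficient } \frac{1}{b + d}\right]$$

$$\frac{\partial P^*}{\partial b} = \frac{0(b + d) - 1(a + c)}{(b + d)^2} = \frac{-(a + c)}{(b + d)^2} \qquad \text{[quotient rule]}$$

$$\frac{\partial P^*}{\partial c} = \frac{1}{b + d} \qquad \left(= \frac{\partial P^*}{\partial a}\right)$$

$$\frac{\partial P^*}{\partial d} = \frac{0(b + d) - 1(a + c)}{(b + d)^2} = \frac{-(a + c)}{(b + d)^2} \qquad \left(= \frac{\partial P^*}{\partial b}\right)$$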











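Since all the parameters are restricted to being positive in the present model, we can
conclude that

$$
\frac{\partial P^*}{\partial a} = \frac{\partial P^*}{\partial c} > 0
\qquad \text{and} \qquad
\frac{\partial P^*}{\partial b} = \frac{\partial P^*}{\partial d} < 0
\tag{7.16}
$$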
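The signs in (7.16) can also be checked numerically. The following is only a minimal sketch: the parameter values are illustrative assumptions (any positive values would do), and the helper names `p_star`, `analytic_derivatives`, and `numeric_derivative` are ours, not part of the model. It compares the four derivatives obtained from (7.14) with central finite differences of the reduced form.

```python
# Reduced-form equilibrium price of the linear market model, eq. (7.14).
def p_star(a, b, c, d):
    return (a + c) / (b + d)


def analytic_derivatives(a, b, c, d):
    """The four comparative-static derivatives of P* obtained from (7.14)."""
    return {
        "a": 1 / (b + d),
        "b": -(a + c) / (b + d) ** 2,
        "c": 1 / (b + d),
        "d": -(a + c) / (b + d) ** 2,
    }


def numeric_derivative(name, params, h=1e-6):
    """Central finite difference of P* with respect to one parameter."""
    up = dict(params)
    up[name] += h
    down = dict(params)
    down[name] -= h
    return (p_star(**up) - p_star(**down)) / (2 * h)


# Illustrative (assumed) parameter values; all positive, as the model requires.
params = {"a": 10.0, "b": 2.0, "c": 2.0, "d": 3.0}
exact = analytic_derivatives(**params)

for name in "abcd":
    approx = numeric_derivative(name, params)
    print(f"dP*/d{name}: analytic {exact[name]:+.4f}, numeric {approx:+.4f}")
```

With these values the derivatives with respect to $a$ and $c$ are positive and those with respect to $b$ and $d$ are negative, in line with (7.16). A numerical check of this kind covers only the chosen parameter values, whereas the derivatives themselves hold for all admissible values.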








For a fuller appreciation of the results in (7.16), let us look at Fig. 7.5, where each
diagram shows a change in one of the parameters. As before, we are plotting Q (rather than P)
on the vertical axis.

Figure 7.5a pictures an increase in the parameter $a$ (to $a'$). This means a higher vertical
intercept for the demand curve, and inasmuch as the parameter $b$ (the slope parameter) is
unchanged, the increase in $a$ results in a parallel upward shift of the demand curve from D
to D′. The intersection of D′ and the supply curve S determines an equilibrium price $P^{*\prime}$,
which is greater than the old equilibrium price $P^*$. This corroborates the result that
$\partial P^*/\partial a > 0$, although for the sake of exposition we have shown in Fig. 7.5a a
much larger change in the parameter $a$ than what the concept of derivative implies.

[FIGURE 7.5: diagrams with Q on the vertical axis, including (a) an increase in $a$ and (b) an increase in $b$]

The situation in Fig. 7.5c has a similar interpretation; but since the increase takes place
in the parameter $c$, the result is a parallel shift of the supply curve instead. Note that
this shift is downward because the supply curve has a vertical intercept of $-c$; thus an
increase in $c$ would mean a change in the intercept, say, from $-2$ to $-4$. The graphical
comparative-static result, that $P^{*\prime}$ exceeds $P^*$, again conforms to what the positive
sign of the derivative $\partial P^*/\partial c$ would lead us to expect.

Figures 7.5b and 7.5d illustrate the effects of changes in the slope parameters $b$ and $d$
of the two functions in the model. An increase in $b$ means that the slope of the demand
curve will assume a larger numerical (absolute) value; i.e., it will become steeper. In
accordance with the result $\partial P^*/\partial b < 0$, we find a decrease in $P^*$ in this
diagram. The increase in $d$ that makes the supply curve steeper also results in a decrease
in the equilibrium price. This is, of course, again in line with the negative sign of the
comparative-static derivative $\partial P^*/\partial d$.

Thus far, all the results in (7.16) seem to have been obtainable graphically. If so, why
should we bother to use differentiation at all? The answer is that the differentiation
approach has at least two major advantages. First, the graphical technique is subject to a
dimensional restriction, but differentiation is not. Even when the number of endogenous
variables and parameters is such that the equilibrium state cannot be shown graphically, we
can nevertheless apply the differentiation techniques to the problem. Second, the
differentiation method can yield results that are on a higher level of generality. The
results in (7.16) will remain valid, regardless of the specific values that the parameters
$a$, $b$, $c$, and $d$ take, as long as they satisfy the sign restrictions. So the
comparative-static conclusions of this model are, in effect, applicable to an infinite number
of combinations of (linear) demand and supply functions. In contrast, the graphical approach
deals only with some specific members of the family of demand and supply curves, and the
analytical result derived therefrom is applicable, strictly speaking, only to the specific
functions depicted.

This discussion serves to illustrate the application of partial differentiation to
comparative-static analysis of the simple market model, but only half of the task has
actually been accomplished, for we can also find the comparative-static derivatives
pertaining to $Q^*$. This we shall leave to you as an exercise.





National-Income Model
In place of the simple national-income model discussed in Chap. 3, let us now work with a
slightly enlarged model with three endogenous variables, Y (national income), C
(consumption), and T (taxes):

$$
\begin{aligned}
Y &= C + I_0 + G_0 \\
C &= \alpha + \beta(Y - T) \qquad (\alpha > 0;\ 0 < \beta < 1) \\
T &= \gamma + \delta Y \qquad (\gamma > 0;\ 0 < \delta < 1)
\end{aligned}
\tag{7.17}
$$

The first equation in this system gives the equilibrium condition for national income, while
the second and third equations show, respectively, how C and T are determined in the model.

The restrictions on the values of the parameters $\alpha$, $\beta$, $\gamma$, and $\delta$ can
be explained thus: $\alpha$ is positive because consumption is positive even if disposable
income $(Y - T)$ is zero; $\beta$ is a positive fraction because it represents the marginal
propensity to consume; $\gamma$ is positive because even if Y is zero the government will
still have a positive tax revenue (from tax bases other than income); and finally, $\delta$ is
a positive fraction because it represents an income tax rate, and as such it cannot exceed
100 percent. The exogenous variables $I_0$ (investment) and $G_0$ (government expenditure)
are, of course, nonnegative. All the parameters and exogenous variables are assumed to be
independent of one another, so that any one of them can be assigned a new value without
affecting the others.

This model can be solved for $Y^*$ by substituting the third equation of (7.17) into the
second and then substituting the resulting equation into the first. The equilibrium income
(in reduced form) is

$$Y^* = \frac{\alpha - \beta\gamma + I_0 + G_0}{1 - \beta + \beta\delta} \tag{7.18}$$

Similar equilibrium values can also be found for the endogenous variables C and T, but we
shall concentrate on the equilibrium income.
From (7.18), there can be obtained six comparative-static derivatives. Among these, the
following three have special policy significance:

$$\frac{\partial Y^*}{\partial G_0} = \frac{1}{1 - \beta + \beta\delta} > 0 \tag{7.19}$$

$$\frac{\partial Y^*}{\partial \gamma} = \frac{-\beta}{1 - \beta + \beta\delta} < 0 \tag{7.20}$$

$$
\frac{\partial Y^*}{\partial \delta}
= \frac{-\beta(\alpha - \beta\gamma + I_0 + G_0)}{(1 - \beta + \beta\delta)^2}
= \frac{-\beta Y^*}{1 - \beta + \beta\delta} < 0
\qquad [\text{by (7.18)}]
\tag{7.21}
$$


The partial derivative in (7.19) gives us the government-expenditure multiplier. It has a
positive sign here because $\beta$ is less than 1, and $\beta\delta$ is greater than zero. If
numerical values are given for the parameters $\beta$ and $\delta$, we can also find the
numerical value of this multiplier from (7.19). The derivative in (7.20) may be called the
nonincome-tax multiplier, because it shows how a change in $\gamma$, the government revenue
from nonincome-tax sources, will affect the equilibrium income. This multiplier is negative
in the present model because the denominator in (7.20) is positive and the numerator is
negative. Lastly, the partial derivative in (7.21), which is not in the nature of a
multiplier since it does not relate a dollar change to another dollar change as the
derivatives in (7.19) and (7.20) do, tells us the extent to which an increase in the income
tax rate $\delta$ will lower the equilibrium income.

Again, note the difference between the two derivatives $\partial Y^*/\partial G_0$ and
$\partial Y/\partial G_0$. The former is derived from (7.18), the expression for the
equilibrium income. The latter, obtainable from the first equation in (7.17), is
$\partial Y/\partial G_0 = 1$, which is altogether different in magnitude and in concept.
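To see what (7.19) through (7.21) yield once numerical values are given, consider the sketch below. The parameter values are hypothetical, chosen only to satisfy the restrictions in (7.17), and the function names are our own. The finite-difference helper perturbs one parameter of the reduced form (7.18) at a time, which is precisely what partial differentiation presupposes.

```python
def y_star(alpha, beta, gamma, delta, I0, G0):
    """Equilibrium income in reduced form, eq. (7.18)."""
    return (alpha - beta * gamma + I0 + G0) / (1 - beta + beta * delta)


def multipliers(alpha, beta, gamma, delta, I0, G0):
    """Comparative-static derivatives (7.19), (7.20), and (7.21)."""
    denom = 1 - beta + beta * delta
    income = y_star(alpha, beta, gamma, delta, I0, G0)
    return {
        "G0": 1 / denom,                  # government-expenditure multiplier
        "gamma": -beta / denom,           # nonincome-tax multiplier
        "delta": -beta * income / denom,  # effect of the income tax rate
    }


def finite_difference(name, values, h=1e-6):
    """Central difference of Y* with respect to one parameter, others held fixed."""
    up = dict(values)
    up[name] += h
    down = dict(values)
    down[name] -= h
    return (y_star(**up) - y_star(**down)) / (2 * h)


# Hypothetical values satisfying the restrictions stated in (7.17).
values = dict(alpha=50.0, beta=0.8, gamma=10.0, delta=0.25, I0=100.0, G0=90.0)
exact = multipliers(**values)

for name in ("G0", "gamma", "delta"):
    print(f"dY*/d{name}: formula {exact[name]:.4f}, "
          f"finite difference {finite_difference(name, values):.4f}")
```

Under these assumed values the denominator $1 - \beta + \beta\delta$ equals 0.4, so the government-expenditure multiplier is about 2.5 and the nonincome-tax multiplier about $-2$.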


Input-Output Model 


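The solution of an open input-output model appears as a matrix equation
$x^* = (I - A)^{-1}d$. If we denote the inverse matrix $(I - A)^{-1}$ by $V = [v_{ij}]$, then,
for instance, the solution for a three-industry economy can be written as $x^* = Vd$, or

$$
\begin{bmatrix} x_1^* \\ x_2^* \\ x_3^* \end{bmatrix}
=
\begin{bmatrix}
v_{11} & v_{12} & v_{13} \\
v_{21} & v_{22} & v_{23} \\
v_{31} & v_{32} & v_{33}
\end{bmatrix}
\begin{bmatrix} d_1 \\ d_2 \\ d_3 \end{bmatrix}
\tag{7.22}
$$

What are the rates of change of the solution values $x_j^*$ with respect to the exogenous
final demands $d_1$, $d_2$, and $d_3$? The general answer is that

$$\frac{\partial x_j^*}{\partial d_k} = v_{jk} \qquad (j, k = 1, 2, 3) \tag{7.23}$$

To see this, let us multiply out $Vd$ in (7.22) and express the solution as

$$
\begin{bmatrix} x_1^* \\ x_2^* \\ x_3^* \end{bmatrix}
=
\begin{bmatrix}
v_{11}d_1 + v_{12}d_2 + v_{13}d_3 \\
v_{21}d_1 + v_{22}d_2 + v_{23}d_3 \\
v_{31}d_1 + v_{32}d_2 + v_{33}d_3
\end{bmatrix}
$$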


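In this system of three equations, each one gives a particular solution value as a function
of the exogenous final demands. Partial differentiation of these produces a total of nine
comparative-static derivatives:

$$
\begin{aligned}
\frac{\partial x_1^*}{\partial d_1} &= v_{11} &
\frac{\partial x_1^*}{\partial d_2} &= v_{12} &
\frac{\partial x_1^*}{\partial d_3} &= v_{13} \\
\frac{\partial x_2^*}{\partial d_1} &= v_{21} &
\frac{\partial x_2^*}{\partial d_2} &= v_{22} &
\frac{\partial x_2^*}{\partial d_3} &= v_{23} \\
\frac{\partial x_3^*}{\partial d_1} &= v_{31} &
\frac{\partial x_3^*}{\partial d_2} &= v_{32} &
\frac{\partial x_3^*}{\partial d_3} &= v_{33}
\end{aligned}
\tag{7.23′}
$$

This is simply the expanded version of (7.23).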
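Reading (7.23′) as three distinct columns, we may combine the three derivatives in each
column into a matrix (vector) derivative:

$$
\frac{\partial x^*}{\partial d_1} \equiv \frac{\partial}{\partial d_1}
\begin{bmatrix} x_1^* \\ x_2^* \\ x_3^* \end{bmatrix}
= \begin{bmatrix} v_{11} \\ v_{21} \\ v_{31} \end{bmatrix}
\qquad
\frac{\partial x^*}{\partial d_2} = \begin{bmatrix} v_{12} \\ v_{22} \\ v_{32} \end{bmatrix}
\qquad
\frac{\partial x^*}{\partial d_3} = \begin{bmatrix} v_{13} \\ v_{23} \\ v_{33} \end{bmatrix}
\tag{7.23″}
$$

Since the three column vectors in (7.23″) are merely the columns of the matrix $V$, by
further consolidation we can summarize the nine derivatives in a single matrix derivative
$\partial x^*/\partial d$. Given $x^* = Vd$, we can simply write

$$
\frac{\partial x^*}{\partial d}
=
\begin{bmatrix}
v_{11} & v_{12} & v_{13} \\
v_{21} & v_{22} & v_{23} \\
v_{31} & v_{32} & v_{33}
\end{bmatrix}
= V = (I - A)^{-1}
$$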


Thus, $(I - A)^{-1}$, the inverse of the Leontief matrix, gives us an ordered display of all
the comparative-static derivatives of our open input-output model. Obviously, this matrix
derivative can easily be extended from the present three-industry model to the general
n-industry case.
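The statement that the columns of $(I - A)^{-1}$ are the comparative-static derivatives can be illustrated with a small computation. In the sketch below, the input-coefficient matrix `A` and the final-demand vector `d` are purely hypothetical, and the helpers `inverse` and `matvec` are our own. The program raises $d_2$ by one unit and compares the resulting change in each $x_j^*$ with the second column of $V$.

```python
def inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    # Augment M with the identity matrix: [M | I].
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [rv - factor * cv for rv, cv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matvec(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


A = [[0.1, 0.2, 0.0],
     [0.3, 0.1, 0.2],
     [0.2, 0.0, 0.3]]      # hypothetical input coefficients
d = [10.0, 5.0, 6.0]       # hypothetical final demands

leontief = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(3)]
            for i in range(3)]
V = inverse(leontief)      # V = (I - A)^(-1)
x_star = matvec(V, d)

# Raise d_2 by one unit: each x_j* should change by v_j2, the second column of V.
d_new = [d[0], d[1] + 1.0, d[2]]
change = [new - old for new, old in zip(matvec(V, d_new), x_star)]
second_column = [V[j][1] for j in range(3)]
print("change in x*:    ", change)
print("second column V: ", second_column)
```

Because $x^* = Vd$ is linear in $d$, the match holds for a change of any size here, not only for an infinitesimal one.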

Comparative-static derivatives of the input-output model are useful as tools of economic
planning, for they provide the answer to the question: If the planning targets, as reflected
in $(d_1, d_2, \ldots, d_n)$, are revised, and if we wish to take care of all direct and
indirect requirements in the economy so as to be completely free of bottlenecks, how must we
change the output goals of the n industries?





EXERCISE 7.5

1. Examine the comparative-static properties of the equilibrium quantity in (7.15), and
   check your results by graphic analysis.

2. On the basis of (7.18), find the partial derivatives $\partial Y^*/\partial I_0$,
   $\partial Y^*/\partial \alpha$, and $\partial Y^*/\partial \beta$. Interpret their meanings
   and determine their signs.

3. The numerical input-output model (5.21) was solved in Sec. 5.7.
   (a) How many comparative-static derivatives can be derived?
   (b) Write out these derivatives in the form of (7.23′) and (7.23″).


7.6 Note on Jacobian Determinants 





Our study of partial derivatives was motivated solely by comparative-static considerations.
But partial derivatives also provide a means of testing whether there exists functional
(linear or nonlinear) dependence among a set of n functions in n variables. This is related
to the notion of Jacobian determinants (named after Jacobi).

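Consider the two functions

$$
\begin{aligned}
y_1 &= 2x_1 + 3x_2 \\
y_2 &= 4x_1^2 + 12x_1x_2 + 9x_2^2
\end{aligned}
\tag{7.24}
$$

If we get all the four partial derivatives

$$
\frac{\partial y_1}{\partial x_1} = 2 \qquad
\frac{\partial y_1}{\partial x_2} = 3 \qquad
\frac{\partial y_2}{\partial x_1} = 8x_1 + 12x_2 \qquad
\frac{\partial y_2}{\partial x_2} = 12x_1 + 18x_2
$$

and arrange them into a square matrix in a prescribed order, called a Jacobian matrix and
denoted by $J$, and then take its determinant, the result will be what is known as a Jacobian
determinant (or a Jacobian, for short), denoted by $|J|$:

$$
|J| \equiv
\begin{vmatrix}
\dfrac{\partial y_1}{\partial x_1} & \dfrac{\partial y_1}{\partial x_2} \\[2ex]
\dfrac{\partial y_2}{\partial x_1} & \dfrac{\partial y_2}{\partial x_2}
\end{vmatrix}
=
\begin{vmatrix}
2 & 3 \\
(8x_1 + 12x_2) & (12x_1 + 18x_2)
\end{vmatrix}
\tag{7.25}
$$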
Ox, Oxy 


For economy of space, this Jacobian is sometimes also expressed as

$$|J| \equiv \left| \frac{\partial(y_1, y_2)}{\partial(x_1, x_2)} \right|$$
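More generally, if we have n differentiable functions in n variables, not necessarily linear,

$$
\begin{aligned}
y_1 &= f^1(x_1, x_2, \ldots, x_n) \\
y_2 &= f^2(x_1, x_2, \ldots, x_n) \\
&\ \ \vdots \\
y_n &= f^n(x_1, x_2, \ldots, x_n)
\end{aligned}
\tag{7.26}
$$

where the symbol $f^n$ denotes the nth function (and not the function raised to the nth
power), we can derive a total of $n^2$ partial derivatives. Adopting the notation
$f^i_j \equiv \partial y_i/\partial x_j$, we can write the Jacobian

$$
|J| \equiv \left| \frac{\partial(y_1, y_2, \ldots, y_n)}{\partial(x_1, x_2, \ldots, x_n)} \right|
\equiv
\begin{vmatrix}
\partial y_1/\partial x_1 & \cdots & \partial y_1/\partial x_n \\
\vdots & & \vdots \\
\partial y_n/\partial x_1 & \cdots & \partial y_n/\partial x_n
\end{vmatrix}
\equiv
\begin{vmatrix}
f^1_1 & \cdots & f^1_n \\
\vdots & & \vdots \\
f^n_1 & \cdots & f^n_n
\end{vmatrix}
\tag{7.27}
$$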


A Jacobian test for the existence of functional dependence among a set of n functions is
provided by the following theorem: The Jacobian $|J|$ defined in (7.27) will be identically
zero for all values of $x_1, \ldots, x_n$ if and only if the n functions $f^1, \ldots, f^n$ in
(7.26) are functionally (linearly or nonlinearly) dependent.

As an example, for the two functions in (7.24) the Jacobian as given in (7.25) has the
value

$$|J| = (24x_1 + 36x_2) - (24x_1 + 36x_2) = 0$$

That is, the Jacobian vanishes for all values of $x_1$ and $x_2$. Therefore, according to the
theorem, the two functions in (7.24) must be dependent. You can verify that $y_2$ is simply
$y_1$ squared; thus they are indeed functionally dependent, here nonlinearly dependent.
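The vanishing of the Jacobian for the pair (7.24) can be spot-checked at a few points. The sketch below evaluates the determinant (7.25) from the analytic partial derivatives and confirms that $y_2 = y_1^2$ at each point; the test points are arbitrary choices of ours, and integer arithmetic keeps the results exact. Checking finitely many points is, of course, no proof of an identity; it merely illustrates the algebra.

```python
def y1(x1, x2):
    return 2 * x1 + 3 * x2


def y2(x1, x2):
    return 4 * x1**2 + 12 * x1 * x2 + 9 * x2**2


def jacobian_det(x1, x2):
    """Determinant (7.25), built from the analytic partial derivatives of (7.24)."""
    j11, j12 = 2, 3                                    # partials of y1
    j21, j22 = 8 * x1 + 12 * x2, 12 * x1 + 18 * x2     # partials of y2
    return j11 * j22 - j12 * j21


points = [(0, 0), (1, 2), (-3, 5), (7, -4)]   # arbitrary integer test points
dets = [jacobian_det(x1, x2) for x1, x2 in points]
squares_match = all(y2(x1, x2) == y1(x1, x2) ** 2 for x1, x2 in points)

print(dets, squares_match)   # prints: [0, 0, 0, 0] True
```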

Let us now consider the special case of linear functions. We have earlier shown that the
rows of the coefficient matrix $A$ of a linear-equation system

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= d_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= d_2 \\
&\ \ \vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n &= d_n
\end{aligned}
\tag{7.28}
$$

are linearly dependent if and only if the determinant $|A| = 0$. This result can now be
interpreted as a special application of the Jacobian criterion of functional dependence.

Take the left side of each equation in (7.28) as a separate function of the n variables
$x_1, \ldots, x_n$, and denote these functions by $y_1, \ldots, y_n$. The partial derivatives
of these functions will turn out to be $\partial y_1/\partial x_1 = a_{11}$,
$\partial y_1/\partial x_2 = a_{12}$, etc., so that we may write, in general,
$\partial y_i/\partial x_j = a_{ij}$. In view of this, the elements of the Jacobian of these n
functions will be precisely the elements of the coefficient matrix $A$, already arranged in
the correct order. That is, we have $|J| = |A|$, and thus the Jacobian criterion of functional
dependence among $y_1, \ldots, y_n$ (or, what amounts to the same thing, linear dependence
among the rows of the coefficient matrix $A$) is equivalent to the criterion $|A| = 0$ in the
present linear case.

We have discussed the Jacobian in the context of a system of n functions in n variables.
It should be pointed out, however, that the Jacobian in (7.27) is defined even if each
function in (7.26) contains more than n variables, say, n + 2 variables:

$$y_i = f^i(x_1, \ldots, x_n, x_{n+1}, x_{n+2}) \qquad (i = 1, 2, \ldots, n)$$

In such a case, if we hold any two of the variables (say, $x_{n+1}$ and $x_{n+2}$) constant, or
treat them as parameters, we will again have n functions in exactly n variables and can form
a Jacobian. Moreover, by holding a different pair of the x variables constant, we can form a
different Jacobian. Such a situation will indeed be encountered in Chap. 8 in connection
with the discussion of the implicit-function theorem.





EXERCISE 7.6

1. Use Jacobian determinants to test the existence of functional dependence between the
   paired functions.
   (a) $y_1 = 3x_1^2 + x_2$
       $y_2 = 9x_1^4 + 6x_1^2(x_2 + 4) + x_2(x_2 + 8) + 12$
   (b) $y_1 = 3x_1^2 + 2x_2^2$
       $y_2 = 5x_1 + 1$
2. Consider (7.22) as a set of three functions $x_i^* = f^i(d_1, d_2, d_3)$ (with i = 1, 2, 3).
   (a) Write out the 3 × 3 Jacobian. Does it have some relation to (7.23′)? Can we write
       $|J| = |V|$?
   (b) Since $V = (I - A)^{-1}$, can we conclude that $|V| \neq 0$? What can we infer from this
       about the three equations in (7.22)?


Chapter 8

Comparative-Static Analysis of General-Function Models


The study of partial derivatives has enabled us, in Chap. 7, to handle the simpler type of
comparative-static problems, in which the equilibrium solution of the model can be
explicitly stated in the reduced form. In that case, partial differentiation of the solution
will directly yield the desired comparative-static information. You will recall that the
definition of the partial derivative requires the absence of any functional relationship
among the independent variables (say, $x_i$), so that $x_1$ can vary without affecting the
values of $x_2, x_3, \ldots, x_n$. As applied to comparative-static analysis, this means that
the parameters and/or exogenous variables which appear in the reduced-form solution must be
mutually independent. Since these are indeed defined as predetermined data for purposes of
the model, the possibility of their mutually affecting one another is inherently ruled out.
The procedure of partial differentiation adopted in Chap. 7 is therefore fully justifiable.

However, no such expediency should be expected when, owing to the inclusion of general
functions in a model, no explicit reduced-form solution can be obtained. In such cases,
we will have to find the comparative-static derivatives directly from the originally given
equations in the model. Take, for instance, a simple national-income model with two
endogenous variables Y and C:

$$
\begin{aligned}
Y &= C + I_0 + G_0 \\
C &= C(Y, T_0) \qquad [T_0\text{: exogenous taxes}]
\end{aligned}
$$

which is reducible to a single equation (an equilibrium condition)

$$Y = C(Y, T_0) + I_0 + G_0$$

to be solved for $Y^*$. Because of the general form of the C function, however, no explicit
solution is available. We must, therefore, find the comparative-static derivatives directly
from this equation. How might we approach the problem? What special difficulty might we
encounter?

Let us suppose that an equilibrium solution $Y^*$ does exist. Then, under certain rather
general conditions (to be discussed in Section 8.5), we may take $Y^*$ to be a differentiable
function of the exogenous variables $I_0$, $G_0$, and $T_0$. Hence we may write the equation

$$Y^* = Y^*(I_0, G_0, T_0)$$

even though we are unable to determine explicitly the form which this function takes.
Furthermore, in some neighborhood of the equilibrium value $Y^*$, the following identical
equality will hold:

$$Y^* \equiv C(Y^*, T_0) + I_0 + G_0$$

This type of identity will be referred to as an equilibrium identity because it is nothing
but the equilibrium condition with the Y variable replaced by its equilibrium value $Y^*$.
Now that $Y^*$ has entered into the picture, it may seem at first blush that simple partial
differentiation of this identity will yield any desired comparative-static derivative, say,
$\partial Y^*/\partial T_0$. This, unfortunately, is not the case. Since $Y^*$ is a function of
$T_0$, the two arguments of the C function are not independent. Specifically, $T_0$ can in
this case affect C not only directly, but also indirectly via $Y^*$. Consequently, partial
differentiation is no longer appropriate for our purposes. How, then, do we tackle this
situation?
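The gap between the direct effect and the full effect can be made concrete with a special case. The sketch below assumes, purely for illustration, a linear consumption function $C = a + b(Y - T_0)$; this form and all numerical values are hypothetical and are not part of the general-function model, whose whole point is that no such closed form is available. With this assumption an explicit $Y^*$ exists, so the naive partial derivative of the right-hand side (holding $Y^*$ fixed) can be compared with the true response of $Y^*$ to $T_0$.

```python
# Hypothetical linear special case: C(Y, T0) = a + b*(Y - T0).
a, b = 40.0, 0.75        # assumed consumption parameters
I0, G0 = 60.0, 50.0      # assumed exogenous investment and government spending


def consumption(Y, T0):
    return a + b * (Y - T0)


def y_star(T0):
    # A closed form exists only because C was assumed to be linear.
    return (a - b * T0 + I0 + G0) / (1 - b)


T0, h = 20.0, 1e-6

# Naive partial derivative of C(Y*, T0) + I0 + G0 with respect to T0,
# holding Y* fixed: this captures the direct effect only.
Y_fixed = y_star(T0)
naive = (consumption(Y_fixed, T0 + h) - consumption(Y_fixed, T0 - h)) / (2 * h)

# Actual response of equilibrium income, letting Y* adjust with T0:
# direct effect plus the indirect effect working through Y*.
actual = (y_star(T0 + h) - y_star(T0 - h)) / (2 * h)

print(f"direct effect only: {naive:.4f}")
print(f"full effect on Y*:  {actual:.4f}")
```

In this special case the direct effect is about $-b = -0.75$, while the full effect is about $-b/(1 - b) = -3$. The difference arises entirely from the feedback of $Y^*$ on consumption, which simple partial differentiation ignores.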

The answer is that we must resort to total differentiation (as against partial
differentiation). Based on the notion of total differentials, the process of total
differentiation can lead us to the related concept of total derivative, which measures the
rate of change of a function such as $C(Y^*, T_0)$ with respect to the argument $T_0$, when
$T_0$ also affects the other argument, $Y^*$. Thus, once we become familiar with these
concepts, we shall be able to deal with functions whose arguments are not all independent,
and that would remove the major stumbling block we have so far encountered in our study of
the comparative statics of a general-function model. As a prelude to the discussion of these
concepts, however, we should first introduce the notion of differentials.





8.1 Differentials 


The symbol $dy/dx$, for the derivative of the function $y = f(x)$, has hitherto been
regarded as a single entity. We shall now reinterpret it as a ratio of two quantities, $dy$
and $dx$.

Differentials and Derivatives
By definition, the derivative $dy/dx \equiv f'(x)$ is the limit of a difference quotient:

$$\frac{dy}{dx} \equiv f'(x) \equiv \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} \tag{8.1}$$

Thus, by itself, $\Delta y/\Delta x$ (without requiring $\Delta x \to 0$) is not equal to
$dy/dx$. If we denote the discrepancy between the two quotients by $\delta$, we can write

$$
\frac{\Delta y}{\Delta x} - \frac{dy}{dx} = \delta
\qquad \text{where } \delta \to 0 \text{ as } \Delta x \to 0 \quad [\text{by (8.1)}]
\tag{8.2}
$$

Multiplying (8.2) through by $\Delta x$, and rearranging, we have

$$
\Delta y = \frac{dy}{dx}\,\Delta x + \delta\,\Delta x
\qquad \text{or} \qquad
\Delta y = f'(x)\,\Delta x + \delta\,\Delta x
\tag{8.3}
$$


This equation describes the change in y ($\Delta y$) that results from a specific, not
necessarily small, change in x ($\Delta x$) from any starting value of x in the domain of the
function $y = f(x)$. But it also suggests that we can, by ignoring the discrepancy term
$\delta\,\Delta x$, use the $f'(x)\,\Delta x$ term as an approximation to the true $\Delta y$
value, where the approximation gets progressively better as $\Delta x$ gets progressively
smaller.

[FIGURE 8.1: panels (a) and (b); the graph of $y = f(x)$, with y on the vertical axis and x on the horizontal axis]

In Fig. 8.1a, when x changes from $x_0$ to $x_0 + \Delta x$, a movement from point A to
point B occurs on the graph of $y = f(x)$. The true $\Delta y$ is measured by the distance CB,
and the ratio of the two distances $CB/AC = \Delta y/\Delta x$ can be read from the slope of
line segment AB. But if we draw a tangent line AD through point A, and use AD in place of AB
to approximate the value of $\Delta y$, we obtain distance CD, which leaves distance DB as the
discrepancy or error of approximation. Since the slope of AD is $f'(x_0)$, distance CD is
equal to $f'(x_0)\,\Delta x$ and, by (8.3), distance DB is equal to $\delta\,\Delta x$.
Obviously, as $\Delta x$ decreases, point B would slide along the curve toward point A,
thereby reducing the discrepancy and making $f'(x)$ or $dy/dx$ a better approximation to
$\Delta y/\Delta x$.
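The shrinking of the discrepancy can be traced numerically. As a minimal sketch, take the function $f(x) = x^2$ and the starting value $x_0 = 3$ (both are our own illustrative choices) and tabulate $\Delta y$, the approximation $f'(x_0)\,\Delta x$, and the discrepancy $\delta$ of (8.2) for successively smaller $\Delta x$.

```python
def f(x):
    return x ** 2          # illustrative function, chosen for simplicity


def f_prime(x):
    return 2 * x           # its derivative


x0 = 3.0
rows = []
for dx in (1.0, 0.1, 0.01, 0.001):
    true_change = f(x0 + dx) - f(x0)      # Delta y, the distance CB
    approx = f_prime(x0) * dx             # f'(x0) * Delta x, the distance CD
    delta = (true_change - approx) / dx   # discrepancy delta as defined in (8.2)
    rows.append((dx, true_change, approx, delta))

for dx, true_change, approx, delta in rows:
    print(f"dx={dx:<6} true={true_change:.6f} approx={approx:.6f} delta={delta:.6f}")
```

For this particular function $\Delta y = 2x_0\,\Delta x + (\Delta x)^2$, so $\delta$ equals $\Delta x$ itself and visibly tends to zero with it.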

Focusing on the tangent line AD, and taking the distance CD as an approximation to CB,
let us relabel the distances AC and CD by $dx$ and $dy$, respectively, as in Fig. 8.1b. Then

$$\frac{dy}{dx} = \text{slope of tangent } AD = f'(x)$$

and, after multiplying through by $dx$, we get

$$dy = f'(x)\,dx \tag{8.4}$$

The derivative $f'(x)$ can then be reinterpreted as the factor of proportionality between the
two finite changes $dy$ and $dx$. Accordingly, given a specific value of $dx$, we can multiply
it by $f'(x)$ to get $dy$ as an approximation to $\Delta y$, with the understanding that the
smaller the $\Delta x$, the better the approximation. The quantities $dx$ and $dy$ are called
the differentials of x and y, respectively.

A few remarks are in order regarding differentials as mathematical entities. First, while
$dx$ is an independent variable, $dy$ is a dependent variable. Specifically, $dy$ is a
function of x as well as of $dx$: It depends on x because a different position for $x_0$ in
Fig. 8.1 would mean a different location for point A and for its tangent line; it depends on
$dx$ because a different magnitude of $dx$ would mean a different position for point C as well
as a different distance CD. Second, if $dx = 0$, then $dy = 0$, because point B would in that
case coincide with point A. But if $dx \neq 0$, then it is possible to divide $dy$ by $dx$ to
get $f'(x)$, just as we can multiply $dx$ by $f'(x)$ to get $dy$. Third, the differential $dy$
can be expressed only in terms of some other differential(s), here $dx$. This is because our
context calls for the coupling of a dependent change $dy$ with an independent change $dx$.
While it makes sense to write $dy = f'(x)\,dx$, it is not meaningful to chop away the $dx$
term on the right and write $dy = f'(x)$. The coupling of the two changes is effected through
the derivative $f'(x)$, which may be viewed as a “converter” that serves to translate a given
change $dx$ into a counterpart change $dy$.

The process of finding the differential $dy$ from a given function $y = f(x)$ is called
differentiation. Recall that we have been using this term as a synonym for derivation,
without having given an adequate explanation. In light of our interpretation of a derivative
as a quotient of two differentials, however, the rationale of the term becomes self-evident.
It is still somewhat ambiguous, though, to use the single term “differentiation” to refer to
the process of finding the differential $dy$ as well as to that of finding the derivative
$dy/dx$. To avoid confusion, the usual practice is to qualify the word differentiation with
the phrase “with respect to x” when we take the derivative $dy/dx$.


Differentials and Point Elasticity
To illustrate the economic application of differentials, let us consider the notion of the
elasticity of a function. Given a demand function $Q = f(P)$, for instance, its elasticity is
defined as $(\Delta Q/Q)/(\Delta P/P)$. Using the idea of approximation explained in Fig. 8.1,
we can replace the independent change $\Delta P$ and the dependent change $\Delta Q$ with the
differentials $dP$ and $dQ$, respectively, to get an approximation elasticity measure known as
the point elasticity of demand and denoted by $\varepsilon_d$ (the Greek letter epsilon, for
“elasticity”):*

$$\varepsilon_d \equiv \frac{dQ/Q}{dP/P} = \frac{dQ/dP}{Q/P} \tag{8.5}$$

Observe that on the extreme right of the expression we have rearranged the differentials
$dQ$ and $dP$ into a ratio $dQ/dP$, which can be construed as the derivative, or the marginal
function, of the demand function $Q = f(P)$. Since we can interpret similarly the ratio
$Q/P$ in the denominator as the average function of the demand function, the point elasticity
of demand $\varepsilon_d$ in (8.5) is seen to be the ratio of the marginal function to the
average function of the demand function.

*The point-elasticity measure can alternatively be interpreted as the limit of
$\dfrac{\Delta Q/Q}{\Delta P/P} = \dfrac{\Delta Q/\Delta P}{Q/P}$ as $\Delta P \to 0$, which
gives the same result as (8.5).





Indeed, this last-described relationship is valid not only for the demand function but also
for any other function, because for any given total function $y = f(x)$ we can write the
formula for the point elasticity of y with respect to x as

$$\varepsilon_{yx} = \frac{dy/dx}{y/x} = \frac{\text{marginal function}}{\text{average function}} \tag{8.6}$$

As a matter of convention, the absolute value of the elasticity measure is used in deciding
whether the function is elastic at a particular point. In the case of a demand function,
for instance, we stipulate: The demand is elastic at a point when $|\varepsilon_d| > 1$, of
unit elasticity when $|\varepsilon_d| = 1$, and inelastic when $|\varepsilon_d| < 1$.

Example 1
Find $\varepsilon_d$ if the demand function is $Q = 100 - 2P$. The marginal function and the
average function of the given demand are

$$\frac{dQ}{dP} = -2 \qquad \text{and} \qquad \frac{Q}{P} = \frac{100 - 2P}{P}$$

so their ratio will give us

$$\varepsilon_d = \frac{-2P}{100 - 2P} = \frac{-P}{50 - P}$$

As written, the elasticity is shown as a function of P. As soon as a specific price is chosen,
however, the point elasticity will be determinate in magnitude. When $P = 25$, for instance,
we have $\varepsilon_d = -1$, or $|\varepsilon_d| = 1$, so that the demand elasticity is
unitary at that point. When $P = 30$, in contrast, we have $|\varepsilon_d| = 1.5$; hence,
demand is elastic at that price. More generally, it may be verified that we have
$|\varepsilon_d| > 1$ for $25 < P < 50$ and $|\varepsilon_d| < 1$ for $0 < P < 25$ in the
present example. (Can a price $P > 50$ be considered meaningful here?)

Example 2
Find the point elasticity of supply $\varepsilon_s$ from the supply function
$Q = P^2 + 7P$, and determine whether the supply is elastic at $P = 2$. Since the marginal and
average functions are, respectively,

$$\frac{dQ}{dP} = 2P + 7 \qquad \text{and} \qquad \frac{Q}{P} = P + 7$$

their ratio gives us the elasticity of supply

$$\varepsilon_s = \frac{2P + 7}{P + 7}$$

When $P = 2$, this elasticity has the value $11/9 > 1$; thus the supply is elastic at $P = 2$.
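The arithmetic in Examples 1 and 2 can be reproduced directly from (8.6) as the ratio of the marginal function to the average function. The following sketch uses Python's `fractions` module so that values such as 11/9 come out exactly; the function names are ours and serve only this illustration.

```python
from fractions import Fraction


def point_elasticity(marginal, average):
    """Point elasticity as in (8.6): marginal function over average function."""
    return marginal / average


def demand_elasticity(P):
    """Example 1: Q = 100 - 2P, with dQ/dP = -2 and Q/P = (100 - 2P)/P."""
    Q = 100 - 2 * P
    return point_elasticity(-2, Q / P)


def supply_elasticity(P):
    """Example 2: Q = P^2 + 7P, with dQ/dP = 2P + 7 and Q/P = P + 7."""
    Q = P ** 2 + 7 * P
    return point_elasticity(2 * P + 7, Q / P)


def classify(elasticity):
    """Apply the convention based on the absolute value of the elasticity."""
    size = abs(elasticity)
    if size > 1:
        return "elastic"
    if size == 1:
        return "of unit elasticity"
    return "inelastic"


for P in (Fraction(10), Fraction(25), Fraction(30)):
    e = demand_elasticity(P)
    print(f"demand, P = {P}: elasticity {e}, {classify(e)}")

e_s = supply_elasticity(Fraction(2))
print(f"supply, P = 2: elasticity {e_s}, {classify(e_s)}")
```

The demand elasticity at $P = 25$ is $-1$ and at $P = 30$ is $-3/2$, while the supply elasticity at $P = 2$ is $11/9$, as found in the two examples.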


At the risk of digressing a trifle, it may also be added here that the interpretation of the
ratio of two differentials as a derivative, and the consequent transformation of the
elasticity formula of a function into a ratio of its marginal to its average, makes possible
a quick way of determining the point elasticity graphically. The two diagrams in Fig. 8.2
illustrate the cases, respectively, of a negatively sloped curve and a positively sloped
curve. In each case, the value of the marginal function at point A on the curve, or at
$x = x_0$ in the domain, is measured by the slope of the tangent line AB. The value of the
average function, on the other hand, is in each case measured by the slope of line OA (the
line joining the point of origin with the given point A on the curve, like a radius vector),
because at point A we have $y = x_0A$ and $x = Ox_0$, so that the average is
$y/x = x_0A/Ox_0 =$ slope of OA. The elasticity at point A can thus be readily ascertained by
comparing the numerical values of the two slopes involved: If AB is steeper than OA, the
function is elastic at point A; in the opposite case, it is inelastic at A. Accordingly, the
function pictured in Fig. 8.2a is inelastic at A (or at $x = x_0$), whereas the one in
Fig. 8.2b is elastic at A.

[FIGURE 8.2: panels (a) and (b)]

[FIGURE 8.3: panels (a) and (b)]

Moreover, the two slopes under comparison are directly dependent on the respective
sizes of the two angles $\theta_m$ and $\theta_a$ (Greek letter theta; the subscripts m and a
indicate marginal and average, respectively). Thus we may, alternatively, compare these two
angles instead of the two corresponding slopes. Referring to Fig. 8.2 again, you can see that
$\theta_m < \theta_a$ at point A in diagram a, indicating that the marginal falls short of the
average in numerical value; thus the function is inelastic at point A. The exact opposite is
true in Fig. 8.2b.

Sometimes, we are interested in locating a point of unitary elasticity on a given curve.
This can now be done easily. If the curve is negatively sloped, as in Fig. 8.3a, we should
find a point C such that the line OC and the tangent BC will make the same-sized angle with
the x axis, though in the opposite direction. In the case of a positively sloped curve, as in
Fig. 8.3b, one has only to find a point C such that the tangent line at C, when properly
extended, passes through the point of origin.




We must warn you that the graphical method just described is based on the assumption 
that the function y = f(x) is plotted with the dependent variable y on the vertical axis. In 
particular, in applying the method to a demand curve, we should make sure that Q is on the 
vertical axis. (Now suppose that Q is actually plotted on the horizontal axis. How should
our method of reading the point elasticity be modified?)





EXERCISE 8.1


1. Find the differential dy, given:
   (a) $y = -x(x^2 + 3)$   (b) $y = (x - 8)(7x + 5)$   (c) $y = \dfrac{x}{x^2 + 1}$

2. Given the import function $M = f(Y)$, where M is imports and Y is national income,
   express the income elasticity of imports $\varepsilon_{MY}$ in terms of the propensities to
   import.

3. Given the consumption function $C = a + bY$ (with $a > 0$; $0 < b < 1$):
   (a) Find its marginal function and its average function.
   (b) Find the income elasticity of consumption $\varepsilon_{CY}$, and determine its sign,
       assuming $Y > 0$.
   (c) Show that this consumption function is inelastic at all positive income levels.
4. Find the point elasticity of demand, given $Q = k/P^n$, where k and n are positive
   constants.
   (a) Does the elasticity depend on the price in this case?
   (b) In the special case where n = 1, what is the shape of the demand curve? What is
       the point elasticity of demand?
5. (a) Find a positively sloped curve with a constant point elasticity everywhere on the
       curve.
   (b) Write the equation of the curve, and verify by (8.6) that the elasticity is indeed a
       constant.
6. Given $Q = 100 - 2P + 0.02Y$, where Q is quantity demanded, P is price, and Y is
   income, and given $P = 20$ and $Y = 5{,}000$, find the
   (a) Price elasticity of demand.
   (b) Income elasticity of demand.


8.2 Total Differentials


The concept of differentials can easily be extended to a function of two or more indepen- 
dent variables. Consider a saving function 


S=S(Y,i) (8.7) 





where S is savings, Y is national income, and i is the interest rate. This function is
assumed (as all the functions we shall use here will be assumed) to be continuous and to
possess continuous (partial) derivatives, or, symbolically, $f \in C'$. The partial derivative
$\partial S/\partial Y$ measures the marginal propensity to save. Thus, for any change in Y,
$dY$, the resulting change in S can be approximated by the quantity
$(\partial S/\partial Y)\,dY$, which is comparable to the right-hand expression in (8.4).
Similarly, given a change in i, $di$, we may take $(\partial S/\partial i)\,di$ as the
approximation to the resulting change in S. The total change in S is then approximated by the
differential

$$dS = \frac{\partial S}{\partial Y}\,dY + \frac{\partial S}{\partial i}\,di \tag{8.8}$$

or, in an alternative notation,

$$dS = S_Y\,dY + S_i\,di$$

Note that the two partial derivatives $S_Y$ and $S_i$ again play the role of “converters” that
serve to convert the changes $dY$ and $di$, respectively, into a corresponding change $dS$.
The expression $dS$, being the sum of the approximate changes from both sources, is called the
total differential of the saving function. And the process of finding such a total
differential is called total differentiation. In contrast, the two additive components to the
right of the equals sign in (8.8) are referred to as the partial differentials of the saving
function.

It is possible, of course, that Y may change while i remains constant. In that case,
di = 0, and the total differential will reduce to $dS = (\partial S/\partial Y)\,dY$. Dividing both
sides by dY, we get

$$\frac{\partial S}{\partial Y} = \left(\frac{dS}{dY}\right)_{i\ \text{constant}}$$

Thus it is clear that the partial derivative $\partial S/\partial Y$ can also be interpreted, in the spirit of
Fig. 8.1b, as the ratio of two differentials dS and dY, with the proviso that i, the other
independent variable in the function, is held constant. Analogously, we can interpret the partial
derivative $\partial S/\partial i$ as the ratio of the differential dS (with Y held constant) to the
differential di. Note that although dS and di can now each stand alone as a differential, the
expression $\partial S/\partial i$ remains as a single entity.

The more general case of a function of n independent variables can be exemplified by,
say, a utility function in the general form

$$U = U(x_1, x_2, \ldots, x_n) \qquad (8.9)$$

The total differential of this function can be written as

$$dU = \frac{\partial U}{\partial x_1}\,dx_1 + \frac{\partial U}{\partial x_2}\,dx_2 + \cdots + \frac{\partial U}{\partial x_n}\,dx_n \qquad (8.10)$$

or

$$dU = U_1\,dx_1 + U_2\,dx_2 + \cdots + U_n\,dx_n = \sum_{i=1}^{n} U_i\,dx_i$$

in which each term on the right side indicates the approximate change in U resulting from
a change in one of the independent variables. Economically, the first term, $U_1\,dx_1$, means
the marginal utility of the first commodity times the increment in consumption of that
commodity, and similarly for the other terms. The sum of these, dU, thus represents the total
approximate change in utility originating from all possible sources of change. As the
reasoning in (8.3) shows, dU, as an approximation, tends toward the true change $\Delta U$ as all the
$dx_i$ terms tend to zero.

Like any other function, the saving function (8.7) and the utility function (8.9) can both
be expected to give rise to point-elasticity measures similar to that defined in (8.6). But each
elasticity measure must in these instances be defined in terms of the change in one of the
independent variables only; there will thus be two such elasticity measures to the saving
function, and n of them to the utility function. These are accordingly called partial
elasticities. For the saving function, the partial elasticities may be written as

$$\varepsilon_{SY} = \frac{\partial S/\partial Y}{S/Y} = \frac{\partial S}{\partial Y}\,\frac{Y}{S} \qquad \text{and} \qquad \varepsilon_{Si} = \frac{\partial S/\partial i}{S/i} = \frac{\partial S}{\partial i}\,\frac{i}{S}$$

For the utility function, the n partial elasticities can be concisely denoted as follows:

$$\varepsilon_{Ux_i} = \frac{\partial U}{\partial x_i}\,\frac{x_i}{U} \qquad (i = 1, 2, \ldots, n)$$
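As a rough numerical illustration of the partial-elasticity formula, the sketch below estimates $\varepsilon_{Ux_i}$ with a central finite difference. The helper name `partial_elasticity`, the step size `h`, and the sample utility function $U = x_1^{0.3} x_2^{0.7}$ are assumptions introduced for this sketch and do not come from the text. For a power function of this kind, each partial elasticity should come out close to the corresponding exponent.

```python
def partial_elasticity(f, point, i, h=1e-6):
    """Estimate (df/dx_i) * (x_i / f) at `point` by a central difference."""
    up, down = list(point), list(point)
    up[i] += h
    down[i] -= h
    slope = (f(*up) - f(*down)) / (2 * h)  # approximates the partial derivative
    return slope * point[i] / f(*point)


# Hypothetical utility function, chosen only for illustration.
def utility(x1, x2):
    return x1 ** 0.3 * x2 ** 0.7


point = (4.0, 9.0)
e1 = partial_elasticity(utility, point, 0)  # elasticity of U with respect to x1
e2 = partial_elasticity(utility, point, 1)  # elasticity of U with respect to x2
print(round(e1, 4), round(e2, 4))
```

The same helper can be pointed at any two-argument function, such as a saving function S(Y, i), as long as the function value at the chosen point is nonzero.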
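Example 1 Find the total differential for the following utility functions, where a, b > 0:
(a) $U(x_1, x_2) = ax_1 + bx_2$
(b) $U(x_1, x_2) = x_1^2 + x_2^3 + x_1x_2$
(c) $U(x_1, x_2) = x_1^a x_2^b$
The total differentials are as follows:
(a) $\dfrac{\partial U}{\partial x_1} = U_1 = a \qquad \dfrac{\partial U}{\partial x_2} = U_2 = b$
and
$$dU = U_1\,dx_1 + U_2\,dx_2 = a\,dx_1 + b\,dx_2$$
(b) $\dfrac{\partial U}{\partial x_1} = U_1 = 2x_1 + x_2 \qquad \dfrac{\partial U}{\partial x_2} = U_2 = 3x_2^2 + x_1$
and
$$dU = U_1\,dx_1 + U_2\,dx_2 = (2x_1 + x_2)\,dx_1 + (3x_2^2 + x_1)\,dx_2$$
(c) $\dfrac{\partial U}{\partial x_1} = U_1 = ax_1^{a-1}x_2^b = \dfrac{ax_1^a x_2^b}{x_1} \qquad \dfrac{\partial U}{\partial x_2} = U_2 = bx_1^a x_2^{b-1} = \dfrac{bx_1^a x_2^b}{x_2}$
and
$$dU = \left(\frac{ax_1^a x_2^b}{x_1}\right)dx_1 + \left(\frac{bx_1^a x_2^b}{x_2}\right)dx_2$$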
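To see the total differential at work as an approximation, the following sketch evaluates dU for the function in Example 1(b) and compares it with the actual change in U. The helper names `partial` and `total_differential`, the starting point, and the sizes of the changes are all assumptions chosen for illustration; the partial derivatives are estimated numerically instead of being taken from the analytic result.

```python
def partial(f, point, i, h=1e-6):
    """Central-difference estimate of the i-th partial derivative of f at `point`."""
    up, down = list(point), list(point)
    up[i] += h
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)


def total_differential(f, point, changes):
    """dU = U_1 dx_1 + ... + U_n dx_n, with each U_i estimated numerically."""
    return sum(partial(f, point, i) * dx for i, dx in enumerate(changes))


# Example 1(b): U = x1^2 + x2^3 + x1*x2
def U(x1, x2):
    return x1 ** 2 + x2 ** 3 + x1 * x2


point = (2.0, 1.0)       # illustrative starting bundle (an assumption)
changes = (0.01, -0.02)  # illustrative dx1 and dx2 (an assumption)

dU = total_differential(U, point, changes)
actual = U(point[0] + changes[0], point[1] + changes[1]) - U(*point)
print(dU, actual)
```

At this point the analytic differential is (2x1 + x2) dx1 + (3x2^2 + x1) dx2 = 5(0.01) + 5(-0.02) = -0.05. The actual change differs slightly because the differential ignores second-order terms, and the gap shrinks as the changes are made smaller.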
EXERCISE 8.2 


1. Express the total differential dU by using the gradient vector $\nabla U$.
2. Find the total differential, given
(a) $z = 3x^2 + xy - 2y^3$
(b) $U = 2x_1 + 9x_1x_2 + x_2^2$
3. Find the total differential, given
(a) $y = \dfrac{x_1}{x_1 + x_2}$ (b) $y = \dfrac{2x_1x_2}{x_1 + x_2}$
4. The supply function of a certain commodity is

$$Q = a + bP^2 + R^{1/2} \qquad (a < 0,\ b > 0) \qquad [R\text{: rainfall}]$$

Find the price elasticity of supply $\varepsilon_{QP}$, and the rainfall elasticity of supply $\varepsilon_{QR}$.










5. How do the two partial elasticities in Prob. 4 vary with P and R? In a strictly monotonic
fashion (assuming positive P and R)?

6. The foreign demand for our exports X depends on the foreign income $Y_f$ and our price
level P: $X = Y_f^{1/2} + P^{-2}$. Find the partial elasticity of foreign demand for our exports
with respect to our price level.

7. Find the total differential for each of the following functions: 

(a) $U = -5x^3 - 12xy - 6y^5$

(b) $U = 7x^2y^3$

(c) $U = 3x^3(8x - 7y)$

(d) $U = (5x^2 + 7y)(2x - 4y^3)$
oy 

(U= v7 

(Ua 38 


8.3 Rules of Differentials 


A straightforward way of finding the total differential dy, given a function

$$y = f(x_1, x_2)$$

is to find the partial derivatives $f_1$ and $f_2$ and substitute these into the equation

$$dy = f_1\,dx_1 + f_2\,dx_2$$

But sometimes it may be more convenient to apply certain rules of differentials which, in
view of their striking resemblance to the derivative formulas studied before, are very easy
to remember.

Let k be a constant and u and v be two functions of the variables $x_1$ and $x_2$. Then the
following rules are valid:*

Rule I $dk = 0$ (cf. constant-function rule)
Rule II $d(cu^n) = cnu^{n-1}\,du$ (cf. power-function rule)
Rule III $d(u \pm v) = du \pm dv$ (cf. sum-difference rule)
Rule IV $d(uv) = v\,du + u\,dv$ (cf. product rule)
Rule V $d\left(\dfrac{u}{v}\right) = \dfrac{1}{v^2}(v\,du - u\,dv)$ (cf. quotient rule)

Instead of proving these rules here, we shall merely illustrate their practical application.

* All the rules of differentials discussed in this section are also applicable when u and v are themselves
the independent variables (rather than functions of some other variables $x_1$ and $x_2$).




Example 1 Find the total differential dy of the function

$$y = 5x_1^2 + 3x_2$$

The straightforward method calls for the evaluation of the partial derivatives $f_1 = 10x_1$ and
$f_2 = 3$, which will then enable us to write

$$dy = f_1\,dx_1 + f_2\,dx_2 = 10x_1\,dx_1 + 3\,dx_2$$

We may, however, let $u = 5x_1^2$ and $v = 3x_2$ and apply the previously given rules to get the
identical answer as follows:

$$\begin{aligned} dy &= d(5x_1^2) + d(3x_2) && \text{[by Rule III]} \\ &= 10x_1\,dx_1 + 3\,dx_2 && \text{[by Rule II]} \end{aligned}$$


Example 2 Find the total differential of the function

$$y = 3x_1^2 + x_1x_2^2$$

Since $f_1 = 6x_1 + x_2^2$ and $f_2 = 2x_1x_2$, the desired differential is

$$dy = (6x_1 + x_2^2)\,dx_1 + 2x_1x_2\,dx_2$$

By applying the given rules, the same result can be arrived at thus:

$$\begin{aligned} dy &= d(3x_1^2) + d(x_1x_2^2) && \text{[by Rule III]} \\ &= 6x_1\,dx_1 + x_2^2\,dx_1 + x_1\,d(x_2^2) && \text{[by Rules II and IV]} \\ &= (6x_1 + x_2^2)\,dx_1 + 2x_1x_2\,dx_2 && \text{[by Rule II]} \end{aligned}$$


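Example 3 Find the total differential of the function

$$y = \frac{x_1 + x_2}{2x_1^2}$$

In view of the fact that the partial derivatives in this case are

$$f_1 = \frac{-(x_1 + 2x_2)}{2x_1^3} \qquad \text{and} \qquad f_2 = \frac{1}{2x_1^2}$$

(check these as an exercise), the desired differential is

$$dy = \frac{-(x_1 + 2x_2)}{2x_1^3}\,dx_1 + \frac{1}{2x_1^2}\,dx_2$$

However, the same result may also be obtained by application of the rules as follows:

$$\begin{aligned} dy &= \frac{1}{4x_1^4}\left[2x_1^2\,d(x_1 + x_2) - (x_1 + x_2)\,d(2x_1^2)\right] && \text{[by Rule V]} \\ &= \frac{1}{4x_1^4}\left[2x_1^2(dx_1 + dx_2) - (x_1 + x_2)\,4x_1\,dx_1\right] && \text{[by Rules III and II]} \\ &= \frac{1}{4x_1^4}\left[-2x_1(x_1 + 2x_2)\,dx_1 + 2x_1^2\,dx_2\right] \\ &= \frac{-(x_1 + 2x_2)}{2x_1^3}\,dx_1 + \frac{1}{2x_1^2}\,dx_2 \end{aligned}$$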
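The result of Example 3 can be spot-checked numerically. The sketch below evaluates the differential just obtained at an arbitrarily chosen point and compares it with the actual change in y; the point (1, 2) and the changes 0.001 and 0.002 are assumptions made only for this check.

```python
def y(x1, x2):
    """The function of Example 3: y = (x1 + x2) / (2 * x1^2)."""
    return (x1 + x2) / (2 * x1 ** 2)


def dy(x1, x2, dx1, dx2):
    """Total differential from Example 3: dy = f1 dx1 + f2 dx2."""
    f1 = -(x1 + 2 * x2) / (2 * x1 ** 3)
    f2 = 1 / (2 * x1 ** 2)
    return f1 * dx1 + f2 * dx2


x1, x2 = 1.0, 2.0          # illustrative point (an assumption)
dx1, dx2 = 0.001, 0.002    # illustrative small changes (an assumption)

approx = dy(x1, x2, dx1, dx2)
exact = y(x1 + dx1, x2 + dx2) - y(x1, x2)
print(approx, exact)
```

At this point f1 = -2.5 and f2 = 0.5, so the differential equals -0.0015, and the actual change agrees with it up to second-order terms.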




These rules can naturally be extended to cases where more than two functions of $x_1$ and
$x_2$ are involved. In particular, we can add the following two rules to the previous collection:

Rule VI $d(u \pm v \pm w) = du \pm dv \pm dw$
Rule VII $d(uvw) = vw\,du + uw\,dv + uv\,dw$

To derive Rule VII, we can employ the familiar trick of first letting z = vw, so that

$$d(uvw) = d(uz) = z\,du + u\,dz \qquad \text{[by Rule IV]}$$

Then, by applying Rule IV again to dz, we get the intermediate result

$$dz = d(vw) = w\,dv + v\,dw$$

which, when substituted into the preceding equation, will yield

$$d(uvw) = vw\,du + u(w\,dv + v\,dw) = vw\,du + uw\,dv + uv\,dw$$

as the desired final result. A similar procedure can be employed to derive Rule VI.





EXERCISE 8.3 


1. Use the rules of differentials to find (a) dz from $z = 3x^2 + xy - 2y^3$ and (b) dU from
$U = 2x_1 + 9x_1x_2 + x_2^2$. Check your answers against those obtained for Exercise 8.2-2.
2. Use the rules of differentials to find dy from the following functions:
(a) $y = \dfrac{x_1}{x_1 + x_2}$ (b) $y = \dfrac{2x_1x_2}{x_1 + x_2}$
Check your answers against those obtained for Exercise 8.2-3.
3. Given $y = 3x_1(2x_2 - 1)(x_3 + 5)$
(a) Find dy by Rule VII.
(b) Find the differential of y, if $dx_2 = dx_3 = 0$.
4. Prove Rules II, III, IV, and V, assuming u and v to be the independent variables (rather
than functions of some other variables).


8.4 Total Derivatives 





We shall now tackle the question posed at the beginning of the chapter; namely, how can we
find the rate of change of the function $C(Y^*, T_0)$ with respect to $T_0$, when $Y^*$ and $T_0$ are
related? As previously mentioned, the answer lies in the concept of total derivative. Unlike
a partial derivative, a total derivative does not require the argument $Y^*$ to remain constant
as $T_0$ varies, and can thus allow for the postulated relationship between the two arguments.


Finding the Total Derivative 
To carry on the discussion in a general framework, let us consider any function

$$y = f(x, w) \qquad \text{where } x = g(w) \qquad (8.11)$$


FIGURE 8.4 Channel map relating w, x, and y


The two functions f and g can also be combined into a composite function

$$y = f[g(w), w] \qquad (8.11')$$

The three variables y, x, and w are related to one another as shown in Fig. 8.4. In this figure,
which we shall refer to as a channel map, it is clearly seen that w, the ultimate source of
change, can affect y through two separate channels: (1) indirectly, via the function g and
then f (the straight arrows), and (2) directly, via the function f (the curved arrow). The
direct effect can simply be represented by the partial derivative $f_w$. But the indirect effect
can only be expressed by a product of two derivatives, $f_x \dfrac{dx}{dw}$, or
$\dfrac{\partial y}{\partial x}\dfrac{dx}{dw}$, by the chain rule for a composite function. Adding up
the two effects gives us the desired total derivative of y with respect to w:

$$\frac{dy}{dw} = f_x\,\frac{dx}{dw} + f_w = \frac{\partial y}{\partial x}\frac{dx}{dw} + \frac{\partial y}{\partial w} \qquad (8.12)$$

This total derivative can also be obtained by an alternative method: We may first
differentiate the function y = f(x, w) totally, to get the total differential

$$dy = f_x\,dx + f_w\,dw$$

and then divide through by dw. The result is identical with (8.12). Either way, the process
of finding the total derivative dy/dw is referred to as the total differentiation of y with
respect to w.

It is extremely important to distinguish between the two look-alike symbols dy/dw and
$\partial y/\partial w$ in (8.12). The former is a total derivative, and the latter, a partial derivative. The
latter is in fact merely a component of the former.


Example 1 Find the total derivative dy/dw, given the function

$$y = f(x, w) = 3x - w^2 \qquad \text{where } x = g(w) = 2w^2 + w + 4$$

By virtue of (8.12), the total derivative should be

$$\frac{dy}{dw} = 3(4w + 1) + (-2w) = 10w + 3$$

As a check, we may substitute the function g into the function f, to get

$$y = 3(2w^2 + w + 4) - w^2 = 5w^2 + 3w + 12$$

which is now a function of w alone. The derivative dy/dw is then easily found to be 10w + 3,
the identical answer.
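The two channels in this example can also be traced in a few lines of code. The sketch below computes the total derivative from formula (8.12) and compares it with a finite-difference derivative of the composite function; the function names and the evaluation point w = 2 are assumptions made for illustration.

```python
def g(w):
    """x = g(w) = 2w^2 + w + 4"""
    return 2 * w ** 2 + w + 4


def f(x, w):
    """y = f(x, w) = 3x - w^2"""
    return 3 * x - w ** 2


def total_derivative(w):
    """dy/dw = f_x * dx/dw + f_w, as in (8.12)."""
    f_x = 3.0           # partial of f with respect to x
    dx_dw = 4 * w + 1   # derivative of g
    f_w = -2 * w        # partial of f with respect to w (direct effect)
    return f_x * dx_dw + f_w


def numeric_derivative(w, h=1e-5):
    """Central difference applied to the composite function y = f[g(w), w]."""
    return (f(g(w + h), w + h) - f(g(w - h), w - h)) / (2 * h)


w0 = 2.0
by_formula = total_derivative(w0)
by_difference = numeric_derivative(w0)
print(by_formula, by_difference)
```

At w = 2 the formula gives 3(9) - 4 = 23, which matches 10w + 3. Note that the direct effect alone, f_w = -4, would badly misstate the response of y to w.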


FIGURE 8.5 Channel map relating w, x1, x2, and y





Example 2 If we have a utility function U = U(c, s), where c is the amount of coffee consumed and s is
the amount of sugar consumed, and another function s = g(c) indicating the
complementarity between these two goods, then we can simply write the composite function

$$U = U[c, g(c)]$$

from which it follows that

$$\frac{dU}{dc} = \frac{\partial U}{\partial c} + \frac{\partial U}{\partial g(c)}\,g'(c)$$
A Variation on the Theme 
The situation is only slightly more complicated when we have

$$y = f(x_1, x_2, w) \qquad \text{where} \quad \begin{cases} x_1 = g(w) \\ x_2 = h(w) \end{cases} \qquad (8.13)$$

The channel map will now appear as in Fig. 8.5. This time, the variable w can affect y
through three channels: (1) indirectly, via the function g and then f, (2) again indirectly, via
the function h and then f, and (3) directly via f. From our previous experience, these three
effects are expected to be expressible, respectively, as
$\dfrac{\partial y}{\partial x_1}\dfrac{dx_1}{dw}$, $\dfrac{\partial y}{\partial x_2}\dfrac{dx_2}{dw}$, and $\dfrac{\partial y}{\partial w}$.
Adding these together, we get the total derivative

$$\frac{dy}{dw} = \frac{\partial y}{\partial x_1}\frac{dx_1}{dw} + \frac{\partial y}{\partial x_2}\frac{dx_2}{dw} + \frac{\partial y}{\partial w} = f_1\,\frac{dx_1}{dw} + f_2\,\frac{dx_2}{dw} + f_w \qquad (8.14)$$

which is comparable to (8.12). If we take the total differential dy, and then divide through
by dw, we can arrive at the same result.


Example 3 Let the production function be

$$Q = Q(K, L, t)$$

where, aside from the two inputs K and L, there is a third argument t, denoting time. The
presence of the t argument indicates that the production function can shift over time in
reflection of technological changes. Thus this is a dynamic rather than a static production
function. Since capital and labor, too, can change over time, we may write

$$K = K(t) \qquad \text{and} \qquad L = L(t)$$

Then the rate of change of output with respect to time can be expressed, in line with the
total-derivative formula (8.14), as

$$\frac{dQ}{dt} = \frac{\partial Q}{\partial K}\frac{dK}{dt} + \frac{\partial Q}{\partial L}\frac{dL}{dt} + \frac{\partial Q}{\partial t}$$

or, in an alternative notation,

$$\frac{dQ}{dt} = Q_K K'(t) + Q_L L'(t) + Q_t$$
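To make the three channels concrete, the sketch below applies the formula to a specific dynamic production function. Everything about the specification is an assumption chosen for illustration (a Cobb-Douglas form with exponents 0.3 and 0.7, an efficiency term growing at 2 percent, and linear paths for K and L); the text itself leaves Q(K, L, t) general.

```python
import math


def A(t):
    """Hypothetical efficiency term that shifts the function over time."""
    return math.exp(0.02 * t)


def K(t):
    """Hypothetical capital path, so that K'(t) = 2."""
    return 100.0 + 2.0 * t


def L(t):
    """Hypothetical labor path, so that L'(t) = 1."""
    return 50.0 + 1.0 * t


def Q(k, l, t):
    """Hypothetical production function Q = A(t) * K^0.3 * L^0.7."""
    return A(t) * k ** 0.3 * l ** 0.7


def dQ_dt(t):
    """Total derivative dQ/dt = Q_K K'(t) + Q_L L'(t) + Q_t."""
    k, l = K(t), L(t)
    q = Q(k, l, t)
    Q_K = 0.3 * q / k   # marginal product of capital
    Q_L = 0.7 * q / l   # marginal product of labor
    Q_t = 0.02 * q      # direct effect of time (technological change)
    return Q_K * 2.0 + Q_L * 1.0 + Q_t


def numeric_dQ_dt(t, h=1e-5):
    """Central difference on output along the time path, used as a check."""
    return (Q(K(t + h), L(t + h), t + h) - Q(K(t - h), L(t - h), t - h)) / (2 * h)


t0 = 5.0
by_formula = dQ_dt(t0)
by_difference = numeric_dQ_dt(t0)
print(by_formula, by_difference)
```

The two numbers should agree closely, and the last term, Q_t, isolates the part of output growth that would occur even if K and L stayed fixed.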


Another Variation on the Theme 


When the ultimate source of change, w in (8.13), is replaced by two coexisting sources, u
and v, the situation becomes the following:

$$y = f(x_1, x_2, u, v) \qquad \text{where} \quad \begin{cases} x_1 = g(u, v) \\ x_2 = h(u, v) \end{cases} \qquad (8.15)$$

While the channel map will now contain more arrows, the principle of its construction
remains the same; we shall, therefore, leave it to you to draw. To find the total derivative of
y with respect to u (while v is held constant), let us take the total differential of y, and then
divide through by the differential du, with the result:

$$\begin{aligned} \frac{dy}{du} &= \frac{\partial y}{\partial x_1}\frac{dx_1}{du} + \frac{\partial y}{\partial x_2}\frac{dx_2}{du} + \frac{\partial y}{\partial u}\frac{du}{du} + \frac{\partial y}{\partial v}\frac{dv}{du} \\ &= \frac{\partial y}{\partial x_1}\frac{dx_1}{du} + \frac{\partial y}{\partial x_2}\frac{dx_2}{du} + \frac{\partial y}{\partial u} \qquad \left[\frac{dv}{du} = 0 \text{ since } v \text{ is held constant}\right] \end{aligned}$$

In view of the fact that we are varying u while holding v constant (as a single derivative
cannot handle changes in u and v both), however, the result obtained must be modified in
two ways: (1) the derivatives $dx_1/du$ and $dx_2/du$ on the right should be rewritten with the
partial sign as $\partial x_1/\partial u$ and $\partial x_2/\partial u$, which is in line with the functions g and h in (8.15); and
(2) the ratio dy/du on the left should also be interpreted as a partial derivative, even
though (being derived through the process of total differentiation of y) it is actually in the
nature of a total derivative. For this reason, we shall refer to it by the explicit name of
partial total derivative, and denote it by §y/§u (with § rather than $\partial$), in order to
distinguish it from the simple partial derivative $\partial y/\partial u$ which, as our result shows, is but one of
three component terms that add up to the partial total derivative.†
With these modifications, our result becomes

$$\frac{\S y}{\S u} = \frac{\partial y}{\partial x_1}\frac{\partial x_1}{\partial u} + \frac{\partial y}{\partial x_2}\frac{\partial x_2}{\partial u} + \frac{\partial y}{\partial u} \qquad (8.16)$$

which is comparable to (8.14). Note the appearance of the symbol $\partial y/\partial u$ on the right,
which necessitates the adoption of the new symbol §y/§u on the left to indicate the broader
concept of a partial total derivative. In a perfectly analogous manner, we can derive the
other partial total derivative, §y/§v. Inasmuch as the roles of u and v are symmetrical in
(8.15), however, a simpler alternative is available to us. All we have to do to obtain §y/§v
is to replace the symbol u in (8.16) by the symbol v throughout.

The use of the new symbols §y/§u and §y/§v for the partial total derivatives, if
unconventional, serves the good purpose of avoiding confusion with the simple partial
derivatives $\partial y/\partial u$ and $\partial y/\partial v$ that can arise from the function f alone in (8.15). However, in the
special case where the f function takes the form of $y = f(x_1, x_2)$ without the arguments
u and v, the simple partial derivatives $\partial y/\partial u$ and $\partial y/\partial v$ are not defined. Hence, it may not
be inappropriate in such a case to use the latter symbols for the partial total derivatives of y
with respect to u and v, since no confusion is likely to arise. Even in that event, though, the
use of a special symbol is advisable for the sake of greater clarity.

† An alternative way of denoting this partial total derivative is
$\left.\dfrac{dy}{du}\right|_{v\ \text{constant}}$ or $\left.\dfrac{dy}{du}\right|_{dv = 0}$.
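The difference between the partial total derivative §y/§u and the simple partial derivative can be seen in a small numerical sketch. The three functions below are hypothetical, chosen only so that the derivatives are easy to verify by hand: y = x1 x2 + u^2 + v, with x1 = u + 2v and x2 = uv.

```python
def g(u, v):
    """Hypothetical x1 = g(u, v) = u + 2v"""
    return u + 2 * v


def h(u, v):
    """Hypothetical x2 = h(u, v) = u * v"""
    return u * v


def f(x1, x2, u, v):
    """Hypothetical y = f(x1, x2, u, v) = x1 * x2 + u^2 + v"""
    return x1 * x2 + u ** 2 + v


def partial_total_wrt_u(u, v):
    """Formula (8.16): f_1 * dx1/du + f_2 * dx2/du + f_u, with v held constant."""
    x1, x2 = g(u, v), h(u, v)
    f_1, f_2, f_u = x2, x1, 2 * u   # simple partials of f
    dx1_du, dx2_du = 1.0, v         # partials of g and h with respect to u
    return f_1 * dx1_du + f_2 * dx2_du + f_u


def numeric_wrt_u(u, v, step=1e-5):
    """Central difference in u on the composite function, holding v fixed."""
    def composite(uu):
        return f(g(uu, v), h(uu, v), uu, v)
    return (composite(u + step) - composite(u - step)) / (2 * step)


u0, v0 = 1.0, 2.0
by_formula = partial_total_wrt_u(u0, v0)   # all three channels
simple_partial = 2 * u0                    # direct channel only
by_difference = numeric_wrt_u(u0, v0)
print(by_formula, simple_partial, by_difference)
```

At (u, v) = (1, 2) the formula gives 2 + 10 + 2 = 14, whereas the simple partial derivative is only 2, which is the last of the three component terms.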





Some General Remarks 

To conclude this section, we offer three general remarks regarding total derivative and total
differentiation:

1. In the cases we have discussed, the situation involves without exception a variable that
is functionally dependent on a second variable, which is in turn dependent functionally
on a third variable. As a consequence, the notion of a chain inevitably enters the picture,
as evidenced by the appearance of a product (or products) of two derivative expressions
as the component(s) of a total derivative. For this reason, the total-derivative formulas in
(8.12), (8.14), and (8.16) can also be regarded as expressions of the chain rule, or the
composite-function rule: a more sophisticated version of the chain rule introduced in
Sec. 7.3.

2. The chain of derivatives does not have to be limited to only two “links” (two derivatives
being multiplied); the concept of total derivative should be extendible to cases where
there are three or more links in the composite function.

3. In all cases discussed, total derivatives (including those which have been called partial
total derivatives) measure rates of change with respect to some ultimate variables in
the chain or, in other words, with respect to certain variables which are in a sense
exogenous and which are not expressed as functions of some other variables. The
essence of the total derivative and of the process of total differentiation is to make
due allowance for all the channels, indirect as well as direct, through which the effects
of a change in an ultimate independent variable can possibly be carried to the particular
dependent variable under study.





EXERCISE 8.4 
1. Find the total derivative dz/dy, given 
(a) z= (x, y) = 5x-b xy, wherex = gy) = 3y" 
(b) $z = 4x^2 - 3xy + 2y^2$, where x = 1/y
(c) z = (x + y)(x - 2y), where x = 2 - 7y
2. Find the total derivative dz/dt, given
(o) z= x? — Bxy — y, where x = 3t and y=1-¢ 







(b) z = 7u + vt, where $u = 2t^2$ and v = t + 1
(c) z = f(x, y, t), where x = a - bt and y = c + kt

3. Find the rate of change of output with respect to time, if the production function is
$Q = A(t)K^{\alpha}L^{\beta}$, where A(t) is an increasing function of t, and $K = K_0 + at$, and
$L = L_0 + bt$.

4. Find the partial total derivatives §W/§u and §W/§v if
(a) $W = ax^2 + bxy + cu$, where $x = \alpha u + \beta v$ and $y = \gamma u$
(b) $W = f(x_1, x_2)$, where $x_1 = 5u^2 + 3v$ and $x_2 = u - 4v^3$

5. Draw a channel map appropriate to the case of (8.15).


6. Derive the expression for §y/§v formally from (8.15) by taking the total differential of y 
and then dividing through by dv. 


8.5 Derivatives of Implicit Functions 





The concept of total differentials can also enable us to find the derivatives of so-called 
implicit functions. 


Implicit Functions 
A function given in the form of y = f(x), say,

$$y = f(x) = 3x^4 \qquad (8.17)$$

is called an explicit function, because the variable y is explicitly expressed as a function of
x. If this function is written alternatively in the equivalent form

$$y - 3x^4 = 0 \qquad (8.17')$$

however, we no longer have an explicit function. Rather, the function (8.17) is then only
implicitly defined by the equation (8.17'). When we are (only) given an equation in the
form of (8.17'), therefore, the function y = f(x) which it implies, and whose specific form
may not even be known to us, is referred to as an implicit function.

An equation in the form of (8.17') can be denoted in general by F(y, x) = 0, because
its left side is a function of the two variables y and x. Note that we are using the capital
letter F here to distinguish it from the function f; the function F, representing the left-side
expression in (8.17'), has two arguments, y and x, whereas the function f, representing the
implicit function, has only one argument, x. There may, of course, be more than two
arguments in the F function. For instance, we may encounter an equation $F(y, x_1, \ldots, x_m) = 0$.
Such an equation may also define an implicit function $y = f(x_1, \ldots, x_m)$.

The equivocal word may in the last sentence was used advisedly. For, whereas an explicit
function, say, y = f(x), can always be transformed into an equation F(y, x) = 0 by simply
transposing the f(x) expression to the left side of the equals sign, the reverse
transformation is not always possible. Indeed, in certain cases, a given equation in the form of
F(y, x) = 0 may not implicitly define a function y = f(x). For instance, the equation
$x^2 + y^2 = 0$ is satisfied only at the point of origin (0, 0), and hence yields no meaningful
function to speak of. As another example, the equation

$$F(y, x) = x^2 + y^2 - 9 = 0 \qquad (8.18)$$

implies not a function, but a relation, because (8.18) plots as a circle, as shown in Fig. 8.6,
so that no unique value of y corresponds to each value of x. Note, however, that if we
restrict y to nonnegative values, then we will have the upper half of the circle only, and that
does constitute a function, namely, $y = +\sqrt{9 - x^2}$. Similarly, the lower half of the circle,
with y values nonpositive, constitutes another function, $y = -\sqrt{9 - x^2}$. In contrast, neither
the left half nor the right half of the circle can qualify as a function.

FIGURE 8.6 The circle $x^2 + y^2 = 9$, with upper half $y = +\sqrt{9 - x^2}$ and lower half $y = -\sqrt{9 - x^2}$

In view of this uncertainty, it becomes of interest to ask whether there are known
general conditions under which we can be sure that a given equation in the form of

$$F(y, x_1, \ldots, x_m) = 0 \qquad (8.19)$$

does indeed define an implicit function

$$y = f(x_1, \ldots, x_m) \qquad (8.20)$$

locally, i.e., around some specific point in the domain. The answer to this lies in the
so-called implicit-function theorem, which states that:

Given (8.19), if (a) the function F has continuous partial derivatives $F_y, F_1, \ldots, F_m$, and if
(b) at a point $(y_0, x_{10}, \ldots, x_{m0})$ satisfying the equation (8.19), $F_y$ is nonzero, then there
exists an m-dimensional neighborhood of $(x_{10}, \ldots, x_{m0})$, N, in which y is an implicitly defined
function of the variables $x_1, \ldots, x_m$, in the form of (8.20). This implicit function satisfies
$y_0 = f(x_{10}, \ldots, x_{m0})$. It also satisfies the equation (8.19) for every m-tuple $(x_1, \ldots, x_m)$ in
the neighborhood N, thereby giving (8.19) the status of an identity in that neighborhood.
Moreover, the implicit function f is continuous and has continuous partial derivatives
$f_1, \ldots, f_m$.


Let us apply this theorem to the equation of the circle, (8.18), which contains only one
x variable. First, we can duly verify that $F_y = 2y$ and $F_x = 2x$ are continuous, as required.
Then we note that $F_y$ is nonzero except when y = 0, that is, except at the leftmost point
(-3, 0) and the rightmost point (3, 0) on the circle. Thus, around any point on the circle
except (-3, 0) and (3, 0), we can construct a neighborhood in which the equation (8.18)
defines an implicit function y = f(x). This is easily verifiable in Fig. 8.6, where it is indeed
possible to draw, say, a rectangle around any point on the circle, except (-3, 0) and
(3, 0), such that the portion of the circle enclosed therein will constitute the graph of a
function, with a unique y value for each value of x in that rectangle.

Several things should be noted about the implicit-function theorem. First, the conditions
cited in the theorem are in the nature of sufficient (but not necessary) conditions. This
means that if we happen to find $F_y = 0$ at a point satisfying (8.19), we cannot use the
theorem to deny the existence of an implicit function around that point. For such a function
may in fact exist (see Exercise 8.5-7).† Second, even if an implicit function f is assured to
exist, the theorem gives no clue as to the specific form the function f takes. Nor, for that
matter, does it tell us the exact size of the neighborhood N in which the implicit function is
defined. However, despite these limitations, this theorem is one of great importance. For
whenever the conditions of the theorem are satisfied, it now becomes meaningful to talk
about and make use of a function such as (8.20), even if our model may contain an
equation (8.19) which is difficult or impossible to solve explicitly for y in terms of the x
variables. Moreover, since the theorem also guarantees the existence of the partial
derivatives $f_1, \ldots, f_m$, it is now also meaningful to talk about these derivatives of the implicit
function.


Derivatives of Implicit Functions 

If the equation $F(y, x_1, \ldots, x_m) = 0$ can be solved for y, we can explicitly write out the
function $y = f(x_1, \ldots, x_m)$, and find its derivatives by the methods learned before. For
instance, (8.18) can be solved to yield two separate functions

$$\begin{aligned} y^+ &= +\sqrt{9 - x^2} && \text{[upper half of circle]} \\ y^- &= -\sqrt{9 - x^2} && \text{[lower half of circle]} \end{aligned} \qquad (8.18')$$

and their derivatives can be found as follows:

$$\begin{aligned} \frac{dy^+}{dx} &= \frac{d}{dx}(9 - x^2)^{1/2} = \frac{1}{2}(9 - x^2)^{-1/2}(-2x) = \frac{-x}{\sqrt{9 - x^2}} = \frac{-x}{y^+} && (y^+ \neq 0) \\ \frac{dy^-}{dx} &= \frac{d}{dx}\left[-(9 - x^2)^{1/2}\right] = -\frac{1}{2}(9 - x^2)^{-1/2}(-2x) = \frac{x}{\sqrt{9 - x^2}} = \frac{-x}{y^-} && (y^- \neq 0) \end{aligned} \qquad (8.21)$$

But what if the given equation, $F(y, x_1, \ldots, x_m) = 0$, cannot be solved for y explicitly?
In this case, if under the terms of the implicit-function theorem an implicit function is
known to exist, we can still obtain the desired derivatives without having to solve for y first.
To do this, we make use of the so-called implicit-function rule, a rule that can give us
the derivatives of every implicit function defined by the given equation. The development
of this rule depends on the following basic facts: (1) if two expressions are identically
equal, their respective total differentials must be equal;‡ (2) differentiation of an
expression that involves $y, x_1, \ldots, x_m$ will yield an expression involving the differentials
$dy, dx_1, \ldots, dx_m$; and (3) the differential of y, dy, can be substituted out, so the fact that
we cannot solve for y does not matter.

† On the other hand, if $F_y = 0$ in an entire neighborhood, then it can be concluded that no implicit
function is defined in that neighborhood. By the same token, if $F_y = 0$ identically, then no implicit
function exists anywhere.

Applying these facts to the equation $F(y, x_1, \ldots, x_m) = 0$ (which, we recall, has the
status of an identity in the neighborhood N in which the implicit function is defined), we
can write dF = d0, or

$$F_y\,dy + F_1\,dx_1 + F_2\,dx_2 + \cdots + F_m\,dx_m = 0 \qquad (8.22)$$

Since the implicit function $y = f(x_1, x_2, \ldots, x_m)$ has the total differential

$$dy = f_1\,dx_1 + f_2\,dx_2 + \cdots + f_m\,dx_m$$

we can substitute this dy expression into (8.22) to get (after collecting terms)

$$(F_y f_1 + F_1)\,dx_1 + (F_y f_2 + F_2)\,dx_2 + \cdots + (F_y f_m + F_m)\,dx_m = 0 \qquad (8.22')$$

The fact that all the $dx_i$ can vary independently from one another means that, for the
equation (8.22') to hold, each parenthesized expression must individually vanish; i.e., we must
have

$$F_y f_i + F_i = 0 \qquad \text{(for all } i\text{)}$$

Dividing through by $F_y$, and solving for $f_i$, we obtain the so-called implicit-function rule
for finding the partial derivative $f_i$ of the implicit function $y = f(x_1, x_2, \ldots, x_m)$:

$$f_i \equiv \frac{\partial y}{\partial x_i} = -\frac{F_i}{F_y} \qquad (i = 1, 2, \ldots, m) \qquad (8.23)$$

In the simple case where the given equation is F(y, x) = 0, the rule gives

$$\frac{dy}{dx} = -\frac{F_x}{F_y} \qquad (8.23')$$

‡ Take, for example, the identity
$$x^2 - y^2 \equiv (x + y)(x - y)$$
This is an identity because the two sides are equal for any values of x and y that one may assign.
Taking the total differential of each side, we have
$$\begin{aligned} d(\text{left side}) &= 2x\,dx - 2y\,dy \\ d(\text{right side}) &= (x - y)\,d(x + y) + (x + y)\,d(x - y) \\ &= (x - y)(dx + dy) + (x + y)(dx - dy) \\ &= 2x\,dx - 2y\,dy \end{aligned}$$
The two results are indeed equal. If two expressions are not identically equal, but are equal only
for certain specific values of the variables, however, their total differentials will not be equal. The
equation
$$x^2 - y^2 = x^2 + y^2 - 2$$
for instance, is valid only for $y = \pm 1$. The total differentials of the two sides are
$$\begin{aligned} d(\text{left side}) &= 2x\,dx - 2y\,dy \\ d(\text{right side}) &= 2x\,dx + 2y\,dy \end{aligned}$$
which are not equal. Note, in particular, that they are not equal even at $y = \pm 1$.


What this rule states is that, even if the specific form of the implicit function is not known
to us, we can nevertheless find its derivative(s) by taking the negative of the ratio of a pair of
partial derivatives of the F function which appears in the given equation that defines the
implicit function. Observe that $F_y$ always appears in the denominator of the ratio. This being
the case, it is not admissible to have $F_y = 0$. Since the implicit-function theorem specifies
that $F_y \neq 0$ at the point around which the implicit function is defined, the problem of a zero
denominator is automatically taken care of in the relevant neighborhood of that point.

Example 1 Find dy/dx for the implicit function defined by (8.17'). Since F(y, x) takes the form of
$y - 3x^4$, we have, by (8.23'),

$$\frac{dy}{dx} = -\frac{F_x}{F_y} = -\frac{-12x^3}{1} = 12x^3$$

In this particular case, we can easily solve the given equation for y to get $y = 3x^4$. Thus the
correctness of the derivative is easily verified.


Example 2 Find dy/dx for the implicit functions defined by the equation of the circle (8.18). This
time we have $F(y, x) = x^2 + y^2 - 9$; thus $F_y = 2y$ and $F_x = 2x$. By (8.23'), the desired
derivative is

$$\frac{dy}{dx} = -\frac{2x}{2y} = -\frac{x}{y} \qquad (y \neq 0)^{†}$$

Earlier, it was asserted that the implicit-function rule gives us the derivative of every implicit
function defined by a given equation. Let us verify this with the two functions in (8.18') and
their derivatives in (8.21). If we substitute $y^+$ for y in the implicit-function-rule result
dy/dx = -x/y, we will indeed obtain the derivative $dy^+/dx$ as shown in (8.21); similarly,
the substitution of $y^-$ for y will yield the other derivative in (8.21). Thus our earlier assertion
is duly verified.
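The same verification can be carried out numerically. In the sketch below, the implicit-function-rule result -x/y is evaluated on each half of the circle and compared with a finite-difference derivative of the corresponding explicit function; the function names and the evaluation point x = 1 are assumptions made for illustration.

```python
import math


def rule(x, y):
    """Implicit-function rule for the circle: dy/dx = -F_x / F_y = -x / y."""
    return -(2 * x) / (2 * y)


def upper(x):
    """Upper half of the circle, y+ = +sqrt(9 - x^2)."""
    return math.sqrt(9 - x ** 2)


def lower(x):
    """Lower half of the circle, y- = -sqrt(9 - x^2)."""
    return -math.sqrt(9 - x ** 2)


def numeric_slope(fn, x, h=1e-6):
    """Central-difference slope of an explicit function at x."""
    return (fn(x + h) - fn(x - h)) / (2 * h)


x0 = 1.0
slope_upper = rule(x0, upper(x0))   # negative on the upper half for x > 0
slope_lower = rule(x0, lower(x0))   # positive on the lower half for x > 0
print(slope_upper, numeric_slope(upper, x0))
print(slope_lower, numeric_slope(lower, x0))
```

One formula thus serves both implicit functions; which derivative it delivers depends only on which y value is substituted. At (3, 0) or (-3, 0) the call to `rule` would divide by zero, which mirrors the restriction y ≠ 0.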


Example 3 Find $\partial y/\partial x$ for any implicit function(s) that may be defined by the equation
$F(y, x, w) = y^3x^2 + w^3 + yxw - 3 = 0$. This equation is not easily solved for y. But since $F_y$, $F_x$, and $F_w$
are all obviously continuous, and since $F_y = 3y^2x^2 + xw$ is indeed nonzero at a point such
as (1, 1, 1) which satisfies the given equation, an implicit function y = f(x, w) assuredly
exists around that point at least. It is thus meaningful to talk about the derivative $\partial y/\partial x$. By
(8.23), moreover, we can immediately write

$$\frac{\partial y}{\partial x} = -\frac{F_x}{F_y} = -\frac{2y^3x + yw}{3y^2x^2 + xw}$$

At the point (1, 1, 1), this derivative has the value -3/4.
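Although the equation in Example 3 is not easily solved for y by hand, it can be solved numerically near (1, 1, 1), which offers an independent check on the value -3/4. The sketch below uses Newton's method in y (a technique brought in here for the check, not one used in the text), and the function names, tolerance, and step size are assumptions.

```python
def F(y, x, w):
    """Left side of the equation in Example 3."""
    return y ** 3 * x ** 2 + w ** 3 + y * x * w - 3


def F_y(y, x, w):
    return 3 * y ** 2 * x ** 2 + x * w


def F_x(y, x, w):
    return 2 * y ** 3 * x + y * w


def solve_y(x, w, guess=1.0, tol=1e-13, max_iter=50):
    """Newton's method in y for F(y, x, w) = 0, started near y = 1."""
    y = guess
    for _ in range(max_iter):
        step = F(y, x, w) / F_y(y, x, w)
        y -= step
        if abs(step) < tol:
            break
    return y


# Implicit-function rule (8.23) evaluated at the point (y, x, w) = (1, 1, 1)
by_rule = -F_x(1.0, 1.0, 1.0) / F_y(1.0, 1.0, 1.0)

# Finite-difference slope of the numerically solved implicit function y = f(x, w)
h = 1e-5
by_difference = (solve_y(1.0 + h, 1.0) - solve_y(1.0 - h, 1.0)) / (2 * h)
print(by_rule, by_difference)
```

The rule needs only the partial derivatives of F at the point, while the check has to re-solve the equation at two nearby values of x; this is the practical advantage of the implicit-function rule.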


Example 4 Assume that the equation F(Q, K, L) = 0 implicitly defines a production function Q =
f(K, L). Let us find a way of expressing the marginal physical products $MPP_K$ and $MPP_L$ in
relation to the function F. Since the marginal products are simply the partial derivatives
$\partial Q/\partial K$ and $\partial Q/\partial L$, we can apply the implicit-function rule and write

$$MPP_K \equiv \frac{\partial Q}{\partial K} = -\frac{F_K}{F_Q} \qquad \text{and} \qquad MPP_L \equiv \frac{\partial Q}{\partial L} = -\frac{F_L}{F_Q}$$


† The restriction y ≠ 0 is of course perfectly consistent with our earlier discussion of the equation
(8.18) that follows the statement of the implicit-function theorem.




Aside from these, we can obtain yet another partial derivative,

$$\frac{\partial K}{\partial L} = -\frac{F_L}{F_K}$$

from the equation F(Q, K, L) = 0. What is the economic meaning of $\partial K/\partial L$? The partial
sign implies that the other variable, Q, is being held constant; it follows that the changes in
K and L described by this derivative are in the nature of “compensatory” changes designed
to keep the output Q constant at a specified level. These are therefore the type of changes
pertaining to movements along a production isoquant drawn with the K variable on the
vertical axis and the L variable on the horizontal axis. As a matter of fact, the derivative $\partial K/\partial L$
is the measure of the slope of such an isoquant, which is negative in the normal case. The
absolute value of $\partial K/\partial L$, on the other hand, is the measure of the marginal rate of technical
substitution between the two inputs.
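A small numerical sketch may help fix these ideas. It assumes a hypothetical implicit relation F(Q, K, L) = Q - K^0.5 L^0.5 = 0, chosen only because its isoquants are easy to check by hand; the text itself leaves F unspecified.

```python
def F_Q(Q, K, L):
    """Partial of the hypothetical F = Q - K^0.5 * L^0.5 with respect to Q."""
    return 1.0


def F_K(Q, K, L):
    return -0.5 * K ** -0.5 * L ** 0.5


def F_L(Q, K, L):
    return -0.5 * K ** 0.5 * L ** -0.5


# A point satisfying F = 0: Q = 4^0.5 * 1^0.5 = 2
Q0, K0, L0 = 2.0, 4.0, 1.0

mpp_K = -F_K(Q0, K0, L0) / F_Q(Q0, K0, L0)            # marginal product of capital
mpp_L = -F_L(Q0, K0, L0) / F_Q(Q0, K0, L0)            # marginal product of labor
isoquant_slope = -F_L(Q0, K0, L0) / F_K(Q0, K0, L0)   # dK/dL with Q held constant
mrts = abs(isoquant_slope)                            # marginal rate of technical substitution
print(mpp_K, mpp_L, isoquant_slope, mrts)
```

For this specification the isoquant through the point is K = 4/L, whose slope at L = 1 is -4, in agreement with the implicit-function rule. The result also shows that the marginal rate of technical substitution equals the ratio of the two marginal products, MPP_L / MPP_K.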


Extension to the Simultaneous-Equation Case 
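The implicit-function theorem also comes in a more general and powerful version that
deals with the conditions under which a set of simultaneous equations

$$\begin{aligned} F^1(y_1, \ldots, y_n; x_1, \ldots, x_m) &= 0 \\ F^2(y_1, \ldots, y_n; x_1, \ldots, x_m) &= 0 \\ &\;\;\vdots \\ F^n(y_1, \ldots, y_n; x_1, \ldots, x_m) &= 0 \end{aligned} \qquad (8.24)$$

will assuredly define a set of implicit functions

$$\begin{aligned} y_1 &= f^1(x_1, \ldots, x_m) \\ y_2 &= f^2(x_1, \ldots, x_m) \\ &\;\;\vdots \\ y_n &= f^n(x_1, \ldots, x_m) \end{aligned} \qquad (8.25)$$

The generalized version of the theorem states that: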

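Given the equation system (8.24), if (a) the functions $F^1, \ldots, F^n$ all have continuous partial derivatives with respect to all the y and x variables, and if (b) at a point $(y_{10}, \ldots, y_{n0};\, x_{10}, \ldots, x_{m0})$ satisfying (8.24), the following Jacobian determinant is nonzero:

$$|J| \equiv \begin{vmatrix}
\dfrac{\partial F^1}{\partial y_1} & \dfrac{\partial F^1}{\partial y_2} & \cdots & \dfrac{\partial F^1}{\partial y_n} \\
\dfrac{\partial F^2}{\partial y_1} & \dfrac{\partial F^2}{\partial y_2} & \cdots & \dfrac{\partial F^2}{\partial y_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial F^n}{\partial y_1} & \dfrac{\partial F^n}{\partial y_2} & \cdots & \dfrac{\partial F^n}{\partial y_n}
\end{vmatrix} \neq 0$$
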

† To view it another way, what these conditions serve to do is to assure us that the n equations in (8.24) can in principle be solved for the n variables $y_1, \ldots, y_n$, even if we may not be able to obtain the solution (8.25) in an explicit form.




then there exists an m-dimensional neighborhood of $(x_{10}, \ldots, x_{m0})$, N, in which the variables $y_1, \ldots, y_n$ are functions of the variables $x_1, \ldots, x_m$ in the form of (8.25). These implicit functions satisfy

$$\begin{aligned}
y_{10} &= f^1(x_{10}, \ldots, x_{m0}) \\
&\;\;\vdots \\
y_{n0} &= f^n(x_{10}, \ldots, x_{m0})
\end{aligned}$$

They also satisfy (8.24) for every m-tuple $(x_1, \ldots, x_m)$ in the neighborhood N, thereby giving (8.24) the status of a set of identities as far as this neighborhood is concerned. Moreover, the implicit functions $f^1, \ldots, f^n$ are continuous and have continuous partial derivatives with respect to all the x variables.





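As in the single-equation case, it is possible to find the partial derivatives of the implicit functions directly from the n equations in (8.24), without having to solve them for the y variables. Taking advantage of the fact that, in the neighborhood N, the equations in (8.24) have the status of identities, we can take the total differential of each of these, and write $dF^j = 0$ (j = 1, 2, ..., n). The result is a set of equations involving the differentials $dy_1, \ldots, dy_n$ and $dx_1, \ldots, dx_m$. Specifically, after transposing the $dx_i$ terms to the right of the equals signs, we have

$$\begin{aligned}
\frac{\partial F^1}{\partial y_1}\,dy_1 + \frac{\partial F^1}{\partial y_2}\,dy_2 + \cdots + \frac{\partial F^1}{\partial y_n}\,dy_n &= -\left(\frac{\partial F^1}{\partial x_1}\,dx_1 + \cdots + \frac{\partial F^1}{\partial x_m}\,dx_m\right) \\
\frac{\partial F^2}{\partial y_1}\,dy_1 + \frac{\partial F^2}{\partial y_2}\,dy_2 + \cdots + \frac{\partial F^2}{\partial y_n}\,dy_n &= -\left(\frac{\partial F^2}{\partial x_1}\,dx_1 + \cdots + \frac{\partial F^2}{\partial x_m}\,dx_m\right) \\
&\;\;\vdots \\
\frac{\partial F^n}{\partial y_1}\,dy_1 + \frac{\partial F^n}{\partial y_2}\,dy_2 + \cdots + \frac{\partial F^n}{\partial y_n}\,dy_n &= -\left(\frac{\partial F^n}{\partial x_1}\,dx_1 + \cdots + \frac{\partial F^n}{\partial x_m}\,dx_m\right)
\end{aligned} \tag{8.26}$$
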
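Moreover, from (8.25), we can write the differentials of the $y_j$ variables as

$$\begin{aligned}
dy_1 &= \frac{\partial y_1}{\partial x_1}\,dx_1 + \frac{\partial y_1}{\partial x_2}\,dx_2 + \cdots + \frac{\partial y_1}{\partial x_m}\,dx_m \\
dy_2 &= \frac{\partial y_2}{\partial x_1}\,dx_1 + \frac{\partial y_2}{\partial x_2}\,dx_2 + \cdots + \frac{\partial y_2}{\partial x_m}\,dx_m \\
&\;\;\vdots \\
dy_n &= \frac{\partial y_n}{\partial x_1}\,dx_1 + \frac{\partial y_n}{\partial x_2}\,dx_2 + \cdots + \frac{\partial y_n}{\partial x_m}\,dx_m
\end{aligned} \tag{8.27}$$

and these can be used to eliminate the $dy_j$ expressions in (8.26). But since the result of substitution would be unmanageably messy, let us simplify matters by considering only what would happen when $x_1$ alone changes while all the other variables $x_2, \ldots, x_m$ remain constant. Letting $dx_1 \neq 0$, but setting $dx_2 = \cdots = dx_m = 0$ in (8.26) and (8.27), then substituting (8.27) into (8.26) and dividing through by $dx_1 \neq 0$, we obtain the equation system

$$\begin{aligned}
\frac{\partial F^1}{\partial y_1}\left(\frac{\partial y_1}{\partial x_1}\right) + \frac{\partial F^1}{\partial y_2}\left(\frac{\partial y_2}{\partial x_1}\right) + \cdots + \frac{\partial F^1}{\partial y_n}\left(\frac{\partial y_n}{\partial x_1}\right) &= -\frac{\partial F^1}{\partial x_1} \\
\frac{\partial F^2}{\partial y_1}\left(\frac{\partial y_1}{\partial x_1}\right) + \frac{\partial F^2}{\partial y_2}\left(\frac{\partial y_2}{\partial x_1}\right) + \cdots + \frac{\partial F^2}{\partial y_n}\left(\frac{\partial y_n}{\partial x_1}\right) &= -\frac{\partial F^2}{\partial x_1} \\
&\;\;\vdots \\
\frac{\partial F^n}{\partial y_1}\left(\frac{\partial y_1}{\partial x_1}\right) + \frac{\partial F^n}{\partial y_2}\left(\frac{\partial y_2}{\partial x_1}\right) + \cdots + \frac{\partial F^n}{\partial y_n}\left(\frac{\partial y_n}{\partial x_1}\right) &= -\frac{\partial F^n}{\partial x_1}
\end{aligned} \tag{8.28}$$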


Even this result, for the case where $x_1$ alone changes, looks formidably complex, because it is full of derivatives. But its structure is actually quite easy to comprehend, once we learn to distinguish between the two types of derivatives that appear in (8.28). One type, which we have parenthesized for visual distinction, consists of the partial derivatives of the implicit functions with respect to $x_1$ that we are seeking. These, therefore, should be viewed as the “variables” to be solved for in (8.28). The other type, on the other hand, consists of the partial derivatives of the $F^j$ functions given in (8.24). Since they would all take specific values when evaluated at the point $(y_{10}, \ldots, y_{n0};\, x_{10}, \ldots, x_{m0})$, the point around which the implicit functions are defined, they appear here not as derivative functions but as derivative values. As such, they can be treated as given constants. This fact makes (8.28) a linear equation system, with a structure similar to (4.1). What is interesting is that such a linear system has arisen during the process of analysis of a problem that is not necessarily linear in itself, since no linearity restrictions have been placed on the equation system (8.24). Thus we have here an illustration of how linear algebra can come into play even in nonlinear problems.

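Being a linear equation system, (8.28) can be written in matrix notation as

$$\begin{bmatrix}
\dfrac{\partial F^1}{\partial y_1} & \dfrac{\partial F^1}{\partial y_2} & \cdots & \dfrac{\partial F^1}{\partial y_n} \\
\dfrac{\partial F^2}{\partial y_1} & \dfrac{\partial F^2}{\partial y_2} & \cdots & \dfrac{\partial F^2}{\partial y_n} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial F^n}{\partial y_1} & \dfrac{\partial F^n}{\partial y_2} & \cdots & \dfrac{\partial F^n}{\partial y_n}
\end{bmatrix}
\begin{bmatrix}
\left(\dfrac{\partial y_1}{\partial x_1}\right) \\
\left(\dfrac{\partial y_2}{\partial x_1}\right) \\
\vdots \\
\left(\dfrac{\partial y_n}{\partial x_1}\right)
\end{bmatrix}
=
\begin{bmatrix}
-\dfrac{\partial F^1}{\partial x_1} \\
-\dfrac{\partial F^2}{\partial x_1} \\
\vdots \\
-\dfrac{\partial F^n}{\partial x_1}
\end{bmatrix} \tag{8.28'}$$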


Since the determinant of the coefficient matrix in (8.28') is nothing but the particular Jacobian determinant |J| which is known to be nonzero under conditions of the implicit-function theorem, and since the system must be nonhomogeneous (why?), there should be a unique nontrivial solution to (8.28'). By Cramer’s rule, this solution may be expressed analytically as follows:

$$\left(\frac{\partial y_j}{\partial x_1}\right) = \frac{|J_j|}{|J|} \qquad (j = 1, 2, \ldots, n) \qquad [\text{see (5.18)}] \tag{8.29}$$

By a suitable adaptation of this procedure, the partial derivatives of the implicit functions with respect to the other variables, $x_2, \ldots, x_m$, can also be obtained. It is a nice feature of this procedure that, each time we allow a particular $x_i$ variable to change, we can obtain in one fell swoop the partial derivatives of all the implicit functions $f^1, \ldots, f^n$ with respect to that particular $x_i$ variable.

Similarly to the implicit-function rule (8.23) for the single-equation case, the procedure just described calls only for the use of the partial derivatives of the F functions, evaluated at the point $(y_{10}, \ldots, y_{n0};\, x_{10}, \ldots, x_{m0})$, in the calculation of the partial derivatives of the implicit functions $f^1, \ldots, f^n$. Thus the matrix equation (8.28') and its analytical solution (8.29) are in effect a statement of the simultaneous-equation version of the implicit-function rule.

Note that the requirement $|J| \neq 0$ rules out a zero denominator in (8.29), just as the requirement $F_y \neq 0$ did in the implicit-function rule (8.23) and (8.23'). Also, the role played by the condition $|J| \neq 0$ in guaranteeing a unique (albeit implicit) solution (8.25) to the general (possibly nonlinear) system (8.24) is very similar to the role of the nonsingularity condition $|A| \neq 0$ in a linear system Ax = d.





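Example 5
The following three equations

$$\begin{aligned}
xy - w &= 0 \qquad & [F^1(x, y, w; z) = 0] \\
y - w^3 - 3z &= 0 \qquad & [F^2(x, y, w; z) = 0] \\
w^3 + z^3 - 2zw &= 0 \qquad & [F^3(x, y, w; z) = 0]
\end{aligned}$$

are satisfied at point P: $(x, y, w; z) = (\tfrac{1}{4}, 4, 1, 1)$. The $F^i$ functions obviously possess continuous derivatives. Thus, if the Jacobian |J| is nonzero at point P, we can use the implicit-function theorem to find the comparative-static derivative $(\partial x/\partial z)$.

To do this, we can first take the total differential of the system:

$$\begin{aligned}
y\,dx + x\,dy - dw &= 0 \\
dy - 3w^2\,dw - 3\,dz &= 0 \\
(3w^2 - 2z)\,dw + (3z^2 - 2w)\,dz &= 0
\end{aligned}$$

Moving the exogenous differential (and its coefficients) to the right-hand side and writing in matrix form, we get

$$\begin{bmatrix} y & x & -1 \\ 0 & 1 & -3w^2 \\ 0 & 0 & (3w^2 - 2z) \end{bmatrix}
\begin{bmatrix} dx \\ dy \\ dw \end{bmatrix}
=
\begin{bmatrix} 0 \\ 3 \\ 2w - 3z^2 \end{bmatrix} dz$$

where the coefficient matrix on the left-hand side is the Jacobian

$$|J| = \begin{vmatrix} F^1_x & F^1_y & F^1_w \\ F^2_x & F^2_y & F^2_w \\ F^3_x & F^3_y & F^3_w \end{vmatrix}
= \begin{vmatrix} y & x & -1 \\ 0 & 1 & -3w^2 \\ 0 & 0 & (3w^2 - 2z) \end{vmatrix}
= y(3w^2 - 2z)$$

At the point P, the Jacobian determinant $|J| = 4\ (\neq 0)$. Therefore, the implicit-function rule applies and

$$\begin{bmatrix} y & x & -1 \\ 0 & 1 & -3w^2 \\ 0 & 0 & (3w^2 - 2z) \end{bmatrix}
\begin{bmatrix} \left(\dfrac{\partial x}{\partial z}\right) \\ \left(\dfrac{\partial y}{\partial z}\right) \\ \left(\dfrac{\partial w}{\partial z}\right) \end{bmatrix}
=
\begin{bmatrix} 0 \\ 3 \\ 2w - 3z^2 \end{bmatrix}$$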


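Using Cramer’s rule to find an expression for $(\partial x/\partial z)$, we obtain

$$\left(\frac{\partial x}{\partial z}\right)
= \frac{\begin{vmatrix} 0 & x & -1 \\ 3 & 1 & -3w^2 \\ 2w - 3z^2 & 0 & (3w^2 - 2z) \end{vmatrix}}{|J|}
= \frac{\begin{vmatrix} 0 & \tfrac{1}{4} & -1 \\ 3 & 1 & -3 \\ -1 & 0 & 1 \end{vmatrix}}{4}$$

$$= \frac{0\begin{vmatrix} 1 & -3 \\ 0 & 1 \end{vmatrix} - 3\begin{vmatrix} \tfrac{1}{4} & -1 \\ 0 & 1 \end{vmatrix} + (-1)\begin{vmatrix} \tfrac{1}{4} & -1 \\ 1 & -3 \end{vmatrix}}{4}
= -\frac{3}{16} - \frac{1}{16} = -\frac{1}{4}$$

Example 6
Let the national-income model (7.17) be rewritten in the form

$$\begin{aligned}
Y - C - I_0 - G_0 &= 0 \\
C - \alpha - \beta(Y - T) &= 0 \\
T - \gamma - \delta Y &= 0
\end{aligned} \tag{8.30}$$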


If we take the endogenous variables (Y, C, T) to be $(y_1, y_2, y_3)$, and take the exogenous variables and parameters $(I_0, G_0, \alpha, \beta, \gamma, \delta)$ to be $(x_1, x_2, \ldots, x_6)$, then the left-side expression in each equation can be regarded as a specific F function, in the form of $F^j(Y, C, T;\, I_0, G_0, \alpha, \beta, \gamma, \delta)$. Thus (8.30) is a specific case of (8.24), with n = 3 and m = 6. Since the functions $F^1$, $F^2$, and $F^3$ do have continuous partial derivatives, and since the relevant Jacobian determinant (the one involving only the endogenous variables),

$$|J| = \begin{vmatrix}
\dfrac{\partial F^1}{\partial Y} & \dfrac{\partial F^1}{\partial C} & \dfrac{\partial F^1}{\partial T} \\
\dfrac{\partial F^2}{\partial Y} & \dfrac{\partial F^2}{\partial C} & \dfrac{\partial F^2}{\partial T} \\
\dfrac{\partial F^3}{\partial Y} & \dfrac{\partial F^3}{\partial C} & \dfrac{\partial F^3}{\partial T}
\end{vmatrix}
= \begin{vmatrix} 1 & -1 & 0 \\ -\beta & 1 & \beta \\ -\delta & 0 & 1 \end{vmatrix}
= 1 - \beta + \beta\delta \tag{8.31}$$

is always nonzero (both $\beta$ and $\delta$ being restricted to be positive fractions), we can take Y, C, and T to be implicit functions of $(I_0, G_0, \alpha, \beta, \gamma, \delta)$ at and around any point that satisfies (8.30). But a point that satisfies (8.30) would be an equilibrium solution, relating to $Y^*$, $C^*$, and $T^*$. Hence, what the implicit-function theorem tells us is that we are justified in writing

$$\begin{aligned}
Y^* &= f^1(I_0, G_0, \alpha, \beta, \gamma, \delta) \\
C^* &= f^2(I_0, G_0, \alpha, \beta, \gamma, \delta) \\
T^* &= f^3(I_0, G_0, \alpha, \beta, \gamma, \delta)
\end{aligned}$$

indicating that the equilibrium values of the endogenous variables are implicit functions of the exogenous variables and the parameters.

The partial derivatives of the implicit functions, such as $\partial Y^*/\partial I_0$ and $\partial Y^*/\partial G_0$, are in the nature of comparative-static derivatives. To find these, we need only the partial derivatives of the F functions, evaluated at the equilibrium state of the model. Moreover, since n = 3, three of these can be found in one operation. Suppose we now hold all exogenous variables and parameters fixed except $G_0$. Then, by adapting the result in (8.28'), we may write the equation

$$\begin{bmatrix} 1 & -1 & 0 \\ -\beta & 1 & \beta \\ -\delta & 0 & 1 \end{bmatrix}
\begin{bmatrix} \partial Y^*/\partial G_0 \\ \partial C^*/\partial G_0 \\ \partial T^*/\partial G_0 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$$

from which three comparative-static derivatives (all with respect to $G_0$) can be calculated. The first one, representing the government-expenditure multiplier, will for instance come out to be

$$\frac{\partial Y^*}{\partial G_0}
= \frac{\begin{vmatrix} 1 & -1 & 0 \\ 0 & 1 & \beta \\ 0 & 0 & 1 \end{vmatrix}}{|J|}
= \frac{1}{1 - \beta + \beta\delta}$$

This is, of course, nothing but the result obtained earlier in (7.19). Note, however, that in the present approach we have worked only with implicit functions, and have completely bypassed the step of solving the system (8.30) explicitly for $Y^*$, $C^*$, and $T^*$. It is this particular feature of the method that will now enable us to tackle the comparative statics of general-function models which, by their very nature, can yield no explicit solution.
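
The Cramer's-rule calculations in the two preceding examples can be checked with a few lines of Python. The sketch below is only an illustration: the helper names `det3` and `cramer3` are our own, and the parameter values chosen for $\beta$ and $\delta$ in the national-income model are hypothetical. Exact fractions are used so that the results can be compared directly with the hand calculations.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix by Laplace expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def cramer3(J, rhs, j):
    """Solve J v = rhs for the j-th unknown (0-based) by Cramer's rule:
    replace column j of J by rhs and divide the determinants."""
    Jj = [row[:j] + [rhs[k]] + row[j + 1:] for k, row in enumerate(J)]
    return Fraction(det3(Jj)) / Fraction(det3(J))


# Three-equation example, evaluated at P: (x, y, w; z) = (1/4, 4, 1, 1).
x, y, w, z = Fraction(1, 4), 4, 1, 1
J_P = [[y, x, -1],
       [0, 1, -3 * w**2],
       [0, 0, 3 * w**2 - 2 * z]]
rhs_P = [0, 3, 2 * w - 3 * z**2]
dx_dz = cramer3(J_P, rhs_P, 0)

# National-income model (8.30) with hypothetical parameter values.
beta, delta = Fraction(4, 5), Fraction(1, 4)
J_Y = [[1, -1, 0],
       [-beta, 1, beta],
       [-delta, 0, 1]]
rhs_G = [1, 0, 0]                      # only G0 changes
dY_dG = cramer3(J_Y, rhs_G, 0)         # government-expenditure multiplier
dC_dG = cramer3(J_Y, rhs_G, 1)
dT_dG = cramer3(J_Y, rhs_G, 2)

print(dx_dz, dY_dG, dC_dG, dT_dG)
```

With these assumed values the multiplier equals $1/(1 - \beta + \beta\delta)$, and the same coefficient matrix delivers all three derivatives with respect to $G_0$, as described above.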





EXERCISE 8.5 


1. For each F(x, y) = 0, find dy/dx for each of the following:
(a) $y - 6x + 7 = 0$
(b) $3y + 12x + 17 = 0$
(c) $x^2 + 6x - 13 - y = 0$
2. For each F(x, y) = 0 use the implicit-function rule to find dy/dx:
(a) $F(x, y) = 3x^2 + 2xy + 4y^3 = 0$
(b) $F(x, y) = 12x^5 - 2y = 0$
(c) $F(x, y) = 7x^2 + 2xy^2 + 9y^4 = 0$
(d) $F(x, y) = 6x^3 - 3y = 0$
3. For each F(x, y, z) = 0 use the implicit-function rule to find $\partial y/\partial x$ and $\partial y/\partial z$:
(a) $F(x, y, z) = x^2y^3 + z^2 + xyz = 0$
(b) $F(x, y, z) = x^3z^2 + y^3 + 4xyz = 0$
(c) $F(x, y, z) = 3x^2y^3 + xz^2y^2 + y^3zx^4 + y^2z = 0$
4. Assuming that the equation $F(U, x_1, x_2, \ldots, x_n) = 0$ implicitly defines a utility function $U = f(x_1, x_2, \ldots, x_n)$:
(a) Find the expressions for $\partial U/\partial x_2$, $\partial U/\partial x_n$, $\partial x_3/\partial x_2$, and $\partial x_4/\partial x_n$.
(b) Interpret their respective economic meanings.
5. For each of the given equations F(y, x) = 0, is an implicit function y = f(x) defined around the point (y = 3, x = 1)?
(a) $x^3 - 2x^2y + 3xy^2 - 22 = 0$
(b) $2x^2 + 4xy - y^4 + 67 = 0$
If your answer is affirmative, find dy/dx by the implicit-function rule, and evaluate it at the said point.

6. Given $x^2 + 3xy + 2yz + y^2 + z^2 - 11 = 0$, is an implicit function z = f(x, y) defined around the point (x = 1, y = 2, z = 0)? If so, find $\partial z/\partial x$ and $\partial z/\partial y$ by the implicit-function rule, and evaluate them at that point.

7. By considering the equation $F(y, x) = (x - y)^3 = 0$ in a neighborhood around the point of origin, prove that the conditions cited in the implicit-function theorem are not in the nature of necessary conditions.

8. If the equation F(x, y, z) = 0 implicitly defines each of the three variables as a function of the other two variables, and if all the derivatives in question exist, find the value of $\dfrac{\partial z}{\partial x}\,\dfrac{\partial x}{\partial y}\,\dfrac{\partial y}{\partial z}$.

9. Justify the assertion in the text that the equation system (8.28') must be nonhomogeneous.

10. From the national-income model (8.30), find the nonincome-tax multiplier by the implicit-function rule. Check your results against (7.20).


8.6 Comparative Statics of General-Function Models 





When we first considered the problem of comparative-static analysis in Chap. 7, we dealt with the case where the equilibrium values of the endogenous variables of the model are expressible explicitly in terms of the exogenous variables and parameters. There, the technique of simple partial differentiation was all we needed. When a model contains functions expressed in the general form, however, that technique becomes inapplicable because of the unavailability of explicit solutions. Instead, a new technique must be employed that makes use of such concepts as total differentials, total derivatives, as well as the implicit-function theorem and the implicit-function rule. We shall illustrate this first with a market model, and then move on to national-income models.


Market Model 

Consider a single-commodity market, where the quantity demanded $Q_d$ is a function not only of price P but also of an exogenously determined income $Y_0$. The quantity supplied $Q_s$, on the other hand, is a function of price alone. If these functions are not given in specific forms, our model may be written generally as follows:

$$\begin{aligned}
Q_d &= Q_s \\
Q_d &= D(P, Y_0) \qquad (\partial D/\partial P < 0;\ \partial D/\partial Y_0 > 0) \\
Q_s &= S(P) \qquad\quad\; (dS/dP > 0)
\end{aligned} \tag{8.32}$$


Both the D and S functions are assumed to possess continuous derivatives or, in other words, to have smooth graphs. Moreover, in order to ensure economic relevance, we have imposed definite restrictions on the signs of these derivatives. By the restriction dS/dP > 0, the supply function is stipulated to be strictly increasing, although it is permitted to be either linear or nonlinear. Similarly, by the restrictions on the two partial derivatives of the demand function, we indicate that it is a strictly decreasing function of price but a strictly increasing function of income. For notational simplicity, the sign restrictions on the derivatives of a function are sometimes indicated with + or − signs placed directly underneath the independent variables. Thus the D and S functions in (8.32) may alternatively be presented as

$$Q_d = D(\underset{(-)}{P},\ \underset{(+)}{Y_0}) \qquad Q_s = S(\underset{(+)}{P})$$

These restrictions serve to confine our analysis to the “normal” case we expect to encounter.

In drawing the usual type of two-dimensional demand curve, the income level is assumed to be held fixed. When income changes, it will upset a given equilibrium by causing a shift of the demand curve. Similarly, in (8.32), $Y_0$ can cause a disequilibrating change through the demand function. Here, $Y_0$ is the only exogenous variable or parameter; thus the comparative-static analysis of this model will be concerned exclusively with how a change in $Y_0$ will affect the equilibrium position of the model.

The equilibrium position of the market is defined by the equilibrium condition $Q_d = Q_s$, which, upon substitution and rearrangement, can be expressed by

$$D(P, Y_0) - S(P) = 0 \tag{8.33}$$

Even though this equation cannot be solved explicitly for the equilibrium price $P^*$, we shall assume that there does exist a static equilibrium, for otherwise there would be no point in even raising the question of comparative statics. From our experience with specific-function models, we have learned to expect $P^*$ to be a function of the exogenous variable $Y_0$:

$$P^* = P^*(Y_0) \tag{8.34}$$


But now we can provide a rigorous foundation for this expectation by appealing to the implicit-function theorem. Inasmuch as (8.33) is in the form of $F(P, Y_0) = 0$, the satisfaction of the conditions of the implicit-function theorem will guarantee that every value of $Y_0$ will yield a unique value of $P^*$ in the neighborhood of a point satisfying (8.33), that is, in the neighborhood of an (initial or “old”) equilibrium solution. In that case, we can indeed write the implicit function $P^* = P^*(Y_0)$ and discuss its derivative, $dP^*/dY_0$ (the very comparative-static derivative we desire), which is known to exist. Let us, therefore, check those conditions. First, the function $F(P, Y_0)$ indeed possesses continuous derivatives; this is because, by assumption, its two additive components $D(P, Y_0)$ and $S(P)$ have continuous derivatives. Second, the partial derivative of F with respect to P, namely, $F_P = \partial D/\partial P - dS/dP$, is negative, and hence nonzero, no matter where it is evaluated. Thus the implicit-function theorem applies, and (8.34) is indeed legitimate.

According to the same theorem, the equilibrium condition (8.33) can now be taken to be an identity in some neighborhood of the equilibrium solution. Consequently, we may write the equilibrium identity

$$\underbrace{D(P^*, Y_0) - S(P^*)}_{F(P^*,\, Y_0)} \equiv 0 \qquad [\text{Excess demand} \equiv 0 \text{ in equilibrium}] \tag{8.35}$$

It then requires only a straight application of the implicit-function rule to produce the comparative-static derivative, $dP^*/dY_0$. For visual clarity, we shall from here on enclose comparative-static derivatives in parentheses to distinguish them from the regular derivative expressions that merely constitute part of the model specification. The result from the implicit-function rule is

$$\left(\frac{dP^*}{dY_0}\right) = -\frac{\partial F/\partial Y_0}{\partial F/\partial P^*} = -\frac{\partial D/\partial Y_0}{\partial D/\partial P^* - dS/dP^*} > 0 \tag{8.36}$$





In this result, the expression $\partial D/\partial P^*$ refers to the derivative $\partial D/\partial P$ evaluated at the initial equilibrium, i.e., at $P = P^*$; a similar interpretation attaches to $dS/dP^*$. In fact, $\partial D/\partial Y_0$ must be evaluated at the equilibrium point as well. By virtue of the sign specifications in (8.32), $(dP^*/dY_0)$ is invariably positive. Thus our qualitative conclusion is that an increase (decrease) in the income level will always result in an increase (decrease) in the equilibrium price. If the values which the derivatives of the demand and supply functions take at the initial equilibrium are known, (8.36) will, of course, yield a quantitative conclusion also.

This discussion of market adjustment is concerned with the effect of a change in $Y_0$ on $P^*$. Is it possible also to find out the effect on the equilibrium quantity $Q^*\ (= Q_d^* = Q_s^*)$? The answer is yes. Since, in the equilibrium state, we have $Q^* = S(P^*)$, and since $P^* = P^*(Y_0)$, we may apply the chain rule to get the derivative

$$\left(\frac{dQ^*}{dY_0}\right) = \frac{dS}{dP^*}\left(\frac{dP^*}{dY_0}\right) > 0 \qquad \left[\text{since } \frac{dS}{dP^*} > 0\right] \tag{8.37}$$

Thus the equilibrium quantity is also positively related to $Y_0$ in this model. Again, (8.37) can supply a quantitative conclusion if the values which the various derivatives take at the equilibrium are known.

The results in (8.36) and (8.37), which exhaust the comparative-static contents of the model (since the latter contains only one exogenous and two endogenous variables), are not surprising. In fact, they convey no more than the proposition that an upward shift of the demand curve will result in a higher equilibrium price as well as a higher equilibrium quantity. This same proposition, it may seem, could have been arrived at in a flash from a simple graphic analysis! This sounds correct, but one should not lose sight of the far, far more general character of the analytical procedure we have used here. The graphic analysis is by its very nature limited to a specific set of curves (the geometric counterpart of a specific set of functions); its conclusions are therefore, strictly speaking, relevant and applicable to only that set of curves. In sharp contrast, the formulation in (8.32), simplified as it is, covers the entire set of possible combinations of negatively sloped demand curves and positively sloped supply curves. Thus it is vastly more general. Also, the analytical procedure used here can handle many problems of greater complexity that would prove to be beyond the capabilities of the graphic approach.
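
To see (8.36) and (8.37) at work, one can pick specific demand and supply functions, find the equilibrium numerically, and compare the implicit-function-rule derivatives with the change in equilibrium actually observed when income is nudged. The Python sketch below does this under assumptions of our own: the forms D(P, Y) = Y/P and S(P) = P², the income level 8, and the bisection solver `equilibrium_price` are all hypothetical choices for illustration, not part of the model in the text.

```python
def demand(P, Y):
    # Hypothetical demand: decreasing in P, increasing in Y (for P, Y > 0).
    return Y / P


def supply(P):
    # Hypothetical supply: increasing in P (for P > 0).
    return P ** 2


def equilibrium_price(Y, lo=1e-6, hi=1e3, tol=1e-12):
    """Find P* by bisection on excess demand D(P, Y) - S(P),
    which is strictly decreasing in P for these functions."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if demand(mid, Y) - supply(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


Y0 = 8.0
P_star = equilibrium_price(Y0)

# Derivatives of the assumed D and S, evaluated at the initial equilibrium.
D_P = -Y0 / P_star ** 2
D_Y = 1 / P_star
S_P = 2 * P_star

dP_dY_rule = -D_Y / (D_P - S_P)     # formula (8.36)
dQ_dY_rule = S_P * dP_dY_rule       # formula (8.37)

# Finite-difference check: re-solve the model at slightly different incomes.
h = 1e-4
P_up, P_down = equilibrium_price(Y0 + h), equilibrium_price(Y0 - h)
dP_dY_fd = (P_up - P_down) / (2 * h)
dQ_dY_fd = (supply(P_up) - supply(P_down)) / (2 * h)

print(P_star, dP_dY_rule, dP_dY_fd, dQ_dY_rule, dQ_dY_fd)
```

Both derivatives are positive for this assumed pair of functions, and the rule-based values agree with the finite-difference values up to numerical error, which is what the general sign result predicts.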











Simultaneous-Equation Approach 

The analysis of model (8.32) was carried out on the basis of a single equation, namely, (8.35). Since only one endogenous variable can fruitfully be incorporated into one equation, the inclusion of $P^*$ means the exclusion of $Q^*$. As a result, we were compelled to find $(dP^*/dY_0)$ first and then to infer $(dQ^*/dY_0)$ in a subsequent step. Now we shall show how $P^*$ and $Q^*$ can be studied simultaneously. As there are two endogenous variables, we shall accordingly set up a two-equation system. First, letting $Q = Q_d = Q_s$ in (8.32) and rearranging, we can express our market model as

$$\begin{aligned}
F^1(P, Q;\, Y_0) &= D(P, Y_0) - Q = 0 \\
F^2(P, Q;\, Y_0) &= S(P) - Q = 0
\end{aligned} \tag{8.38}$$

which is in the form of (8.24), with n = 2 and m = 1. It becomes of interest, once again, to check the conditions of the implicit-function theorem. First, since the demand and supply functions are both assumed to possess continuous derivatives, so must the functions $F^1$ and $F^2$. Second, the endogenous-variable Jacobian (the one involving P and Q) indeed turns out to be nonzero, regardless of where it is evaluated, because

$$|J| = \begin{vmatrix} \dfrac{\partial F^1}{\partial P} & \dfrac{\partial F^1}{\partial Q} \\ \dfrac{\partial F^2}{\partial P} & \dfrac{\partial F^2}{\partial Q} \end{vmatrix}
= \begin{vmatrix} \dfrac{\partial D}{\partial P} & -1 \\ \dfrac{dS}{dP} & -1 \end{vmatrix}
= \frac{dS}{dP} - \frac{\partial D}{\partial P} > 0 \tag{8.39}$$

Hence, if an equilibrium solution $(P^*, Q^*)$ exists (as we must assume in order to make it meaningful to talk about comparative statics), the implicit-function theorem tells us that we can write the implicit functions

$$P^* = P^*(Y_0) \qquad \text{and} \qquad Q^* = Q^*(Y_0) \tag{8.40}$$

even though we cannot solve for $P^*$ and $Q^*$ explicitly. These functions are known to have continuous derivatives. Moreover, (8.38) will have the status of a pair of identities in some neighborhood of the equilibrium state, so that we may also write

$$\begin{aligned}
D(P^*, Y_0) - Q^* &\equiv 0 \qquad [\text{i.e., } F^1(P^*, Q^*;\, Y_0) = 0] \\
S(P^*) - Q^* &\equiv 0 \qquad [\text{i.e., } F^2(P^*, Q^*;\, Y_0) = 0]
\end{aligned} \tag{8.41}$$


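From these, $(dP^*/dY_0)$ and $(dQ^*/dY_0)$ can be found simultaneously by using the implicit-function rule (8.28').

In the present context, with $F^1$ and $F^2$ as defined in (8.41), and with two endogenous variables $P^*$ and $Q^*$ and a single exogenous variable $Y_0$, the implicit-function rule takes the specific form

$$\begin{bmatrix} \dfrac{\partial F^1}{\partial P^*} & \dfrac{\partial F^1}{\partial Q^*} \\ \dfrac{\partial F^2}{\partial P^*} & \dfrac{\partial F^2}{\partial Q^*} \end{bmatrix}
\begin{bmatrix} \left(\dfrac{dP^*}{dY_0}\right) \\ \left(\dfrac{dQ^*}{dY_0}\right) \end{bmatrix}
=
\begin{bmatrix} -\dfrac{\partial F^1}{\partial Y_0} \\ -\dfrac{\partial F^2}{\partial Y_0} \end{bmatrix}$$

Note that the comparative-static derivatives are written here with the symbol d rather than $\partial$, because there is only one exogenous variable in the present problem. More specifically, the last equation can be expressed as

$$\begin{bmatrix} \dfrac{\partial D}{\partial P^*} & -1 \\ \dfrac{dS}{dP^*} & -1 \end{bmatrix}
\begin{bmatrix} \left(\dfrac{dP^*}{dY_0}\right) \\ \left(\dfrac{dQ^*}{dY_0}\right) \end{bmatrix}
=
\begin{bmatrix} -\dfrac{\partial D}{\partial Y_0} \\ 0 \end{bmatrix}$$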
















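By Cramer's rule, and using (8.39), we then find the solution to be

$$\begin{aligned}
\left(\frac{dP^*}{dY_0}\right) &= \frac{\begin{vmatrix} -\dfrac{\partial D}{\partial Y_0} & -1 \\ 0 & -1 \end{vmatrix}}{|J|} = \frac{\dfrac{\partial D}{\partial Y_0}}{|J|} \\[2ex]
\left(\frac{dQ^*}{dY_0}\right) &= \frac{\begin{vmatrix} \dfrac{\partial D}{\partial P^*} & -\dfrac{\partial D}{\partial Y_0} \\ \dfrac{dS}{dP^*} & 0 \end{vmatrix}}{|J|} = \frac{\dfrac{dS}{dP^*}\,\dfrac{\partial D}{\partial Y_0}}{|J|}
\end{aligned} \tag{8.42}$$

where all the derivatives of the demand and supply functions (including those appearing in the Jacobian) are to be evaluated at the initial equilibrium. You can check that the results just obtained are identical with those obtained earlier in (8.36) and (8.37), by means of the single-equation approach.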

Instead of directly applying the implicit-function rule, we can also reach the same result by first differentiating totally each identity in (8.41) in turn, to get a linear system of equations in the variables $dP^*$ and $dQ^*$:

$$\begin{aligned}
\frac{\partial D}{\partial P^*}\,dP^* - dQ^* &= -\frac{\partial D}{\partial Y_0}\,dY_0 \\
\frac{dS}{dP^*}\,dP^* - dQ^* &= 0
\end{aligned}$$

and then dividing through by $dY_0 \neq 0$, and interpreting each quotient of two differentials as a derivative.
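
The agreement between the simultaneous-equation solution (8.42) and the single-equation results (8.36) and (8.37) can be confirmed with a small Python sketch. The three derivative values below are hypothetical numbers chosen only to respect the sign restrictions in (8.32), and `det2` is a helper of our own.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c


# Hypothetical derivative values at the initial equilibrium (assumptions):
D_P = Fraction(-2)      # partial of D with respect to P   (< 0)
D_Y = Fraction(1, 2)    # partial of D with respect to Y0  (> 0)
S_P = Fraction(4)       # dS/dP                            (> 0)

# Coefficient matrix and right-hand side of the 2x2 system for
# (dP*/dY0, dQ*/dY0).
J = [[D_P, -1],
     [S_P, -1]]
rhs = [-D_Y, 0]

det_J = det2(J)                                   # equals S_P - D_P, as in (8.39)
dP_dY = det2([[rhs[0], J[0][1]],
              [rhs[1], J[1][1]]]) / det_J         # first line of (8.42)
dQ_dY = det2([[J[0][0], rhs[0]],
              [J[1][0], rhs[1]]]) / det_J         # second line of (8.42)

# Single-equation results (8.36) and (8.37) for comparison.
dP_dY_single = -D_Y / (D_P - S_P)
dQ_dY_single = S_P * dP_dY_single

print(dP_dY, dQ_dY)
```

Because exact fractions are used, the two approaches can be compared with `==` rather than with a numerical tolerance.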





Use of Total Derivatives 
In both the single-equation and the simultaneous-equation approaches illustrated above, we have taken the total differentials of both sides of an equilibrium identity and then equated the two results to arrive at the implicit-function rule. Instead of taking the total differentials, however, it is possible to take, and equate, the total derivatives of the two sides of the equilibrium identity with respect to a particular exogenous variable or parameter.

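In the single-equation approach, for instance, the equilibrium identity is

$$D(P^*, Y_0) - S(P^*) \equiv 0 \qquad [\text{from (8.35)}]$$

where $P^* = P^*(Y_0)$ [from (8.34)]

Taking the total derivative of the equilibrium identity with respect to $Y_0$ (which takes into account the indirect as well as the direct effects of a change in $Y_0$) will therefore give us the equation

$$\underbrace{\frac{\partial D}{\partial P^*}\left(\frac{dP^*}{dY_0}\right)}_{\substack{\text{indirect effect} \\ \text{of } Y_0 \text{ on } D}}
+ \underbrace{\frac{\partial D}{\partial Y_0}}_{\substack{\text{direct effect} \\ \text{of } Y_0 \text{ on } D}}
- \underbrace{\frac{dS}{dP^*}\left(\frac{dP^*}{dY_0}\right)}_{\substack{\text{indirect effect} \\ \text{of } Y_0 \text{ on } S}} = 0$$

When this is solved for $(dP^*/dY_0)$, the result is identical with the one in (8.36).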
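
The solving step is short enough to spell out. Collecting the two terms that contain $(dP^*/dY_0)$ and moving the direct-effect term to the right gives the following, where the sign conclusion relies only on the restrictions already stated in (8.32):

```latex
\begin{aligned}
\left(\frac{\partial D}{\partial P^*} - \frac{dS}{dP^*}\right)\left(\frac{dP^*}{dY_0}\right)
  &= -\frac{\partial D}{\partial Y_0} \\[1ex]
\left(\frac{dP^*}{dY_0}\right)
  &= -\frac{\partial D/\partial Y_0}{\partial D/\partial P^* - dS/dP^*} > 0
\end{aligned}
```

The denominator is negative (a negative demand slope minus a positive supply slope) and the numerator $\partial D/\partial Y_0$ is positive, so the leading minus sign makes the whole expression positive.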







FIGURE 8.7 (channel map of the effects of $Y_0$ on the D and S functions)





In the simultaneous-equation approach, on the other hand, there is a pair of equilibrium identities:

$$\begin{aligned}
D(P^*, Y_0) - Q^* &\equiv 0 \\
S(P^*) - Q^* &\equiv 0
\end{aligned} \qquad [\text{from (8.41)}]$$

where $P^* = P^*(Y_0)$ and $Q^* = Q^*(Y_0)$ [from (8.40)]


The various effects of $Y_0$ are now harder to keep track of, but with the help of the channel map in Fig. 8.7, the pattern should become clear. This channel map tells us, for instance, that when differentiating the D function with respect to $Y_0$, we must allow for the indirect effect of $Y_0$ upon D through $P^*$, as well as the direct effect of $Y_0$ (curved arrow). In differentiating the S function with respect to $Y_0$, on the other hand, there is only the indirect effect (through $P^*$) to be taken into account. Thus the result of totally differentiating the two identities with respect to $Y_0$ is, upon rearrangement, the following pair of equations:

$$\begin{aligned}
\frac{\partial D}{\partial P^*}\left(\frac{dP^*}{dY_0}\right) - \left(\frac{dQ^*}{dY_0}\right) &= -\frac{\partial D}{\partial Y_0} \\
\frac{dS}{dP^*}\left(\frac{dP^*}{dY_0}\right) - \left(\frac{dQ^*}{dY_0}\right) &= 0
\end{aligned}$$

These are, of course, identical with the equations obtained by the total-differential method, and they lead again to the comparative-static derivatives in (8.42).


National-Income Model (IS-LM) 
A typical application of the implicit-function theorem is a general-functional form of the IS-LM model.† Equilibrium in this macroeconomic model is characterized by an income level and an interest rate that simultaneously produce equilibrium in both the goods market and the money market.

A goods market is described by the following set of equations:

$$\begin{aligned}
Y &= C + I + G \qquad & C &= C(Y - T) \qquad & G &= G_0 \\
I &= I(r) \qquad & T &= T(Y)
\end{aligned}$$

















Y is the level of gross domestic product (GDP), or national income. In this form of the model, Y can also be thought of as aggregate supply. C, I, G, and T are consumption, investment, government spending, and taxes, respectively.


† IS stands for “investment equals savings” and LM stands for “liquidity preference equals money supply.”




1. Consumption is assumed to be a strictly increasing function of disposable income (Y − T). If we denote disposable income as $Y^d = Y - T$, then the consumption function can be expressed as

$$C = C(Y^d)$$

where $dC/dY^d = C'(Y^d)$ is the marginal propensity to consume $(0 < C'(Y^d) < 1)$.

2. Investment spending is assumed to be a strictly decreasing function of the rate of interest, r:

$$\frac{dI}{dr} = I'(r) < 0$$

3. The public sector is described by two variables: government spending (G) and taxes (T). Typically, government spending is assumed to be exogenous (set by policy) whereas taxes are assumed to be an increasing function of income. $dT/dY = T'(Y)$ is the marginal tax rate $(0 < T'(Y) < 1)$.

If we substitute the functions for C, I, G into the first equation Y = C + I + G, we get

$$Y = C(Y - T(Y)) + I(r) + G_0 \qquad \text{(IS curve)}$$

which gives us a single equation with two endogenous variables: Y and r. This equation gives us all the combinations of Y and r that produce equilibrium in the goods market; it thus implicitly defines the IS curve.


Slope of the IS Curve 
If we rewrite the IS equation, which is in the nature of an equilibrium identity,

$$Y - C(Y^d) - I(r) - G_0 \equiv 0$$

then the total differential with respect to Y and r is

$$dY - C'(Y^d)[1 - T'(Y)]\,dY - I'(r)\,dr = 0 \qquad \left[\text{Note: } \frac{dY^d}{dY} = 1 - T'(Y)\right]$$

We can rearrange the dY and dr terms to get an expression for the slope of the IS curve:

$$\frac{dr}{dY} = \frac{1 - C'(Y^d)[1 - T'(Y)]}{I'(r)} < 0$$

Given the restrictions placed on the derivatives of C, I, and T, we can easily verify that the slope of the IS curve is negative.
The money market can be described by the following three equations:

$$\begin{aligned}
M^d &= L(Y, r) \qquad & [\text{money demand}] \qquad \text{where } L_Y > 0 \text{ and } L_r < 0 \\
M^s &= M_0^s \qquad & [\text{money supply}]
\end{aligned}$$

where the money supply is assumed to be exogenously determined by the central monetary authority, and

$$M^d = M^s \qquad [\text{equilibrium condition}]$$




Substituting the first two equations into the third, we get an expression that implicitly defines the LM curve, which is again in the nature of an equilibrium identity.

$$L(Y, r) \equiv M_0^s$$


Slope of the LM Curve 
Since this is an equilibrium identity, we can take the total differential with respect to the two endogenous variables, Y and r:

$$L_Y\,dY + L_r\,dr = 0$$

which can be rearranged to give us an expression for the slope of the LM curve

$$\frac{dr}{dY} = -\frac{L_Y}{L_r} > 0$$

Since $L_Y > 0$ and $L_r < 0$, we can determine that the slope of the LM curve is positive.

The simultaneous macroeconomic equilibrium state of both the goods and money markets can be described by the following system of equations:

$$\begin{aligned}
Y &= C(Y^d) + I(r) + G_0 \\
L(Y, r) &= M_0^s
\end{aligned}$$


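which implicitly define the two endogenous variables, Y and r, as functions of the exogenous variables, $G_0$ and $M_0^s$. Taking the total differential of the system, we get

$$\begin{aligned}
dY - C'(Y^d)[1 - T'(Y)]\,dY - I'(r)\,dr &= dG_0 \\
L_Y\,dY + L_r\,dr &= dM_0^s
\end{aligned}$$

or, in matrix form,

$$\begin{bmatrix} 1 - C'(Y^d)[1 - T'(Y)] & -I'(r) \\ L_Y & L_r \end{bmatrix}
\begin{bmatrix} dY \\ dr \end{bmatrix}
=
\begin{bmatrix} dG_0 \\ dM_0^s \end{bmatrix}$$

The Jacobian determinant is

$$|J| = \begin{vmatrix} 1 - C'(Y^d)[1 - T'(Y)] & -I'(r) \\ L_Y & L_r \end{vmatrix}
= \{1 - C'(Y^d)[1 - T'(Y)]\}\,L_r + L_Y\,I'(r) < 0$$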
Since $|J| \neq 0$, this system satisfies the conditions of the implicit-function theorem and the implicit functions

$$Y^* = Y^*(G_0, M_0^s)$$

and

$$r^* = r^*(G_0, M_0^s)$$

can be written even though we are unable to solve for $Y^*$ and $r^*$ explicitly. Nevertheless, we can perform comparative-static exercises to determine the effects of a change in one of the exogenous variables $(G_0, M_0^s)$ on the equilibrium values of $Y^*$ and $r^*$. Consider the comparative-static derivatives $\partial Y^*/\partial G_0$ and $\partial r^*/\partial G_0$, which we shall derive by applying the implicit-function theorem to our system of total differentials in matrix form

$$\begin{bmatrix} 1 - C'(Y^d)[1 - T'(Y)] & -I'(r) \\ L_Y & L_r \end{bmatrix}
\begin{bmatrix} dY^* \\ dr^* \end{bmatrix}
=
\begin{bmatrix} dG_0 \\ dM_0^s \end{bmatrix}$$

First we set $dM_0^s = 0$ and divide both sides by $dG_0$.

$$\begin{bmatrix} 1 - C'\cdot(1 - T') & -I'(r) \\ L_Y & L_r \end{bmatrix}
\begin{bmatrix} \dfrac{dY^*}{dG_0} \\ \dfrac{dr^*}{dG_0} \end{bmatrix}
=
\begin{bmatrix} 1 \\ 0 \end{bmatrix}$$

Using Cramer’s rule, we obtain

$$\frac{dY^*}{dG_0} = \frac{\begin{vmatrix} 1 & -I' \\ 0 & L_r \end{vmatrix}}{|J|} = \frac{L_r}{|J|} > 0$$

and

$$\frac{dr^*}{dG_0} = \frac{\begin{vmatrix} 1 - C'\cdot(1 - T') & 1 \\ L_Y & 0 \end{vmatrix}}{|J|} = \frac{-L_Y}{|J|} > 0$$

From the implicit-function theorem, these ratios of differentials, $dY^*/dG_0$ and $dr^*/dG_0$, can be interpreted as partial derivatives,

$$\frac{\partial Y^*(G_0, M_0^s)}{\partial G_0} \qquad \text{and} \qquad \frac{\partial r^*(G_0, M_0^s)}{\partial G_0}$$

which are our desired comparative-static derivatives.
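
A quick numerical check of these IS-LM results can be sketched in Python. All of the derivative values below are hypothetical numbers of our own, chosen only to satisfy the sign restrictions of the model ($0 < C' < 1$, $0 < T' < 1$, $I' < 0$, $L_Y > 0$, $L_r < 0$); `det2` is likewise our own helper.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c


# Hypothetical derivative values at the initial equilibrium (assumptions):
C_prime = Fraction(4, 5)    # marginal propensity to consume
T_prime = Fraction(1, 4)    # marginal tax rate
I_prime = Fraction(-50)     # dI/dr < 0
L_Y = Fraction(1, 2)        # money demand rises with income
L_r = Fraction(-100)        # money demand falls with the interest rate

a = 1 - C_prime * (1 - T_prime)          # top-left entry of the Jacobian
J = [[a, -I_prime],
     [L_Y, L_r]]
det_J = det2(J)

# Cramer's rule with dG0 as the only disturbance (right-hand side [1, 0]).
dY_dG = det2([[1, -I_prime],
              [0, L_r]]) / det_J
dr_dG = det2([[a, 1],
              [L_Y, 0]]) / det_J

print(det_J, dY_dG, dr_dG)
```

With these assumed numbers the Jacobian is negative and both derivatives are positive, matching the signs derived above: a rise in government spending raises both equilibrium income and the equilibrium interest rate.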











Extending the Model: An Open Economy 


One property of a model that economists look for is its robustness: the ability of the model to be applied to different settings. At this point we will extend the basic model to incorporate the foreign sector.


1. Net exports. Let X denote exports, M denote imports, and E denote the exchange rate (measured as the domestic price of foreign currency). Exports are an increasing function of the exchange rate.

$$X = X(E) \qquad \text{where } X'(E) > 0$$

Imports are a decreasing function of the exchange rate but an increasing function of income.

$$M = M(Y, E) \qquad \text{where } M_Y > 0,\ M_E < 0$$

2. Capital flows. The net flow of capital into a country is a function of both the domestic interest rate r and the world interest rate $r_w$. Let K denote net capital inflow such that

$$K = K(r, r_w) \qquad \text{where } K_r > 0,\ K_{r_w} < 0$$




3. Balance of payments. The inflows and outflows of foreign currency for a country are typically separated into two accounts: the current account (net exports of goods and services) and the capital account (the purchasing of foreign and domestic bonds). Together, the two accounts make up the balance of payments.

$$\begin{aligned}
BP &= \text{current account} + \text{capital account} \\
&= [X(E) - M(Y, E)] + K(r, r_w)
\end{aligned}$$

Under flexible exchange rates, the exchange rate adjusts to keep the balance of payments equal to zero. Having the balance of payments equal to zero is equivalent to saying that the supply of foreign currency equals the demand for foreign currency by a country.†


Open-Economy Equilibrium 
Equilibrium in an open economy is characterized by three conditions: aggregate demand
equals aggregate supply; the demand for money equals the supply of money; the balance of
payments equals zero. Adding the foreign sector to our basic model gives us the following
system of three equations

Y = C(Y^D) + I(r) + G_0 + X(E) - M(Y, E)
L(Y, r) = M_0^s
X(E) - M(Y, E) + K(r, r_w) = 0

Since we have three equations, we need three endogenous variables, which are Y, r, and
E. The exogenous variables now become G_0, M_0^s, and r_w. Rewriting the system as
equilibrium identities F^1 = 0, F^2 = 0, F^3 = 0 allows us to find the Jacobian:

Y - C(Y^D) - I(r) - G_0 - X(E) + M(Y, E) = 0
L(Y, r) - M_0^s = 0
X(E) - M(Y, E) + K(r, r_w) = 0
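
\[
|J| = \begin{vmatrix}
1 - C'(1 - T') + M_Y & -I' & M_E - X' \\
L_Y & L_r & 0 \\
-M_Y & K_r & X' - M_E
\end{vmatrix}
\]

Using Laplace expansion down the third column, we obtain

\[
\begin{aligned}
|J| &= (M_E - X') \begin{vmatrix} L_Y & L_r \\ -M_Y & K_r \end{vmatrix}
     + (X' - M_E) \begin{vmatrix} 1 - C'(1 - T') + M_Y & -I' \\ L_Y & L_r \end{vmatrix} \\
    &= (M_E - X')(L_Y K_r + L_r M_Y)
     + (X' - M_E)\{[1 - C'(1 - T') + M_Y] L_r + L_Y I'\} \\
    &= (M_E - X')\{L_Y (K_r - I') + L_r [C'(1 - T') - 1]\}
\end{aligned}
\]

Given the assumptions about the signs of the partial derivatives and the restriction that
0 < C'(1 - T') < 1, we can determine that |J| < 0. Therefore, we can write the implicit
functions

Y* = Y*(G_0, M_0^s, r_w)
r* = r*(G_0, M_0^s, r_w)
E* = E*(G_0, M_0^s, r_w)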


† Under a fixed exchange-rate regime, the balance of payments is not necessarily zero. In such an
event, any surpluses or deficits are recorded as a change in official settlements.




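Taking the total differential of the system of equations and writing it in matrix form

\[
\begin{bmatrix}
1 - C'(1 - T') + M_Y & -I' & M_E - X' \\
L_Y & L_r & 0 \\
-M_Y & K_r & X' - M_E
\end{bmatrix}
\begin{bmatrix} dY^* \\ dr^* \\ dE^* \end{bmatrix}
=
\begin{bmatrix} dG_0 \\ dM_0^s \\ -K_{r_w}\, dr_w \end{bmatrix}
\]

will allow us to carry out a series of comparative-static exercises. Let's consider the impact
of a change in the world interest rate r_w on the equilibrium values of Y, r, and E. Setting
dG_0 = dM_0^s = 0 and dividing both sides by dr_w gives us

\[
\begin{bmatrix}
1 - C'(1 - T') + M_Y & -I' & M_E - X' \\
L_Y & L_r & 0 \\
-M_Y & K_r & X' - M_E
\end{bmatrix}
\begin{bmatrix}
\dfrac{\partial Y^*}{\partial r_w} \\[2ex]
\dfrac{\partial r^*}{\partial r_w} \\[2ex]
\dfrac{\partial E^*}{\partial r_w}
\end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ -K_{r_w} \end{bmatrix}
\]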


Using Cramer's rule, we obtain the comparative-static derivatives 


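\[
\frac{\partial Y^*}{\partial r_w}
= \frac{\begin{vmatrix}
0 & -I' & M_E - X' \\
0 & L_r & 0 \\
-K_{r_w} & K_r & X' - M_E
\end{vmatrix}}{|J|}
= \frac{(-K_{r_w})(-L_r)(M_E - X')}{|J|} > 0
\]

and

\[
\frac{\partial r^*}{\partial r_w}
= \frac{\begin{vmatrix}
1 - C'(1 - T') + M_Y & 0 & M_E - X' \\
L_Y & 0 & 0 \\
-M_Y & -K_{r_w} & X' - M_E
\end{vmatrix}}{|J|}
= \frac{K_{r_w}(-L_Y)(M_E - X')}{|J|} > 0
\]

and

\[
\frac{\partial E^*}{\partial r_w}
= \frac{\begin{vmatrix}
1 - C'(1 - T') + M_Y & -I' & 0 \\
L_Y & L_r & 0 \\
-M_Y & K_r & -K_{r_w}
\end{vmatrix}}{|J|}
= \frac{-K_{r_w}\{[1 - C'(1 - T') + M_Y] L_r + L_Y I'\}}{|J|} > 0
\]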





At this point you should compare the results we have derived with the principles of
macroeconomics. Intuitively, a rise in the world interest rate should lead to an increase in
capital outflows and a depreciation of the domestic currency. This, in turn, will lead to an
increase in net exports and income. The increase in domestic income will cause an increase in
money demand, putting upward pressure on domestic interest rates. This result is illustrated
graphically in Fig. 8.8, where a rise in world interest rates leads to a rightward shift of the
IS curve.
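
The signs of the three open-economy derivatives can be confirmed with a small numerical sketch. All slope values below are hypothetical and were chosen only to satisfy the sign assumptions of the model (X' > 0, M_Y > 0, M_E < 0, K_r > 0, K_{r_w} < 0, and so on); the helper names det3 and replace_column are likewise illustrative.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def replace_column(m, col, vec):
    """Copy of m with column `col` replaced by `vec` (the Cramer's-rule step)."""
    return [[vec[r] if k == col else m[r][k] for k in range(3)] for r in range(3)]


# Hypothetical slopes (assumptions; only their signs follow the text).
C_p, T_p, I_p = 0.8, 0.25, -50.0
L_Y, L_r = 0.5, -100.0
X_p, M_Y, M_E = 20.0, 0.1, -10.0
K_r, K_rw = 30.0, -30.0

J = [
    [1 - C_p * (1 - T_p) + M_Y, -I_p, M_E - X_p],
    [L_Y, L_r, 0.0],
    [-M_Y, K_r, X_p - M_E],
]
rhs = [0.0, 0.0, -K_rw]  # right-hand side after setting dG_0 = dM_0^s = 0 and dividing by dr_w

det_J = det3(J)
dY_drw, dr_drw, dE_drw = tuple(
    det3(replace_column(J, k, rhs)) / det_J for k in range(3)
)

# Closed form of |J| from the Laplace expansion, for comparison.
closed_form = (M_E - X_p) * (L_Y * (K_r - I_p) + L_r * (C_p * (1 - T_p) - 1))

print(det_J, closed_form)
print(dY_drw, dr_drw, dE_drw)
```

Under these assumed values the determinant is negative and all three derivatives are positive, which matches the story told above: higher world interest rates raise income, the domestic interest rate, and the exchange rate (a depreciation).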




FIGURE 8.8 











Summary of the Procedure 

In the analysis of the general-function market model and national-income model, it is not 
possible to obtain explicit solution values of the endogenous variables. Instead, we rely on 
the implicit-function theorem to enable us to write the implicit solutions such as 


P* = P*(Y_0) and r* = r*(G_0, M_0^s)


Our subsequent search for the comparative-static derivatives such as (dP*/dY_0) and
(∂r*/∂G_0) then rests for its meaningfulness upon the known fact (thanks again to the
implicit-function theorem) that the P* and r* functions do possess continuous derivatives.

To facilitate the application of that theorem, we make it a standard practice to write the
equilibrium condition(s) of the model in the form of (8.19) or (8.24). We then check
whether (1) the F function(s) have continuous derivatives and (2) the value of F_P or the
endogenous-variable Jacobian determinant (as the case may be) is nonzero at the initial
equilibrium of the model. However, as long as the individual functions in the model have
continuous derivatives (an assumption which is often adopted as a matter of course in
general-function models), the first condition is automatically satisfied. As a practical
matter, therefore, we need only check the value of F_P or the endogenous-variable
Jacobian. And if it is nonzero at the equilibrium, we may proceed at once to the task of
finding the comparative-static derivatives.

To that end, the implicit-function rule is of help. For the single-equation case, simply set
the endogenous variable equal to its equilibrium value (e.g., set P = P*) in the equilibrium
condition, and then apply the rule as stated in (8.23) to the resulting equilibrium identity.
For the simultaneous-equation case, we must also first set all endogenous variables
equal to their respective equilibrium values in the equilibrium conditions. Then we can
either apply the implicit-function rule as illustrated in (8.29) to the resulting equilibrium
identities, or arrive at the same result by carrying out the several steps outlined as follows:


1. Take the total differential of each equilibrium identity in turn.
2. Select one, and only one, exogenous variable (say, X_0) as the sole disequilibrating
factor, and set the differentials of all other exogenous variables equal to zero. Then divide
all remaining terms in each identity by dX_0, and interpret each quotient of two
differentials as a comparative-static derivative (a partial one if the model contains two or
more exogenous variables).*
3. Solve the resulting equation system for the comparative-static derivatives appearing
therein, and interpret their economic implications. In this step, if Cramer's rule is used,
we can take advantage of the fact that, earlier, in checking the condition |J| ≠ 0, we
have in fact already calculated the determinant of the coefficient matrix of the equation
system now being solved.
4. For the analysis of another disequilibrating factor (another exogenous variable), if any,
repeat steps 2 and 3. Although a different group of comparative-static derivatives will
emerge in the new equation system, the coefficient matrix will be the same as before,
and thus the known value of |J| can again be put to use.


Given a model with m exogenous variables, it will take exactly m applications of steps 1, 2,
and 3 to catch all the comparative-static derivatives there are.
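
The steps above can be sketched as a small routine that computes |J| once and then reuses it for every exogenous variable. This is a minimal illustration under stated assumptions, not a general-purpose solver: the function names, the dictionary layout of the right-hand sides, and the numerical coefficients (a closed-economy system with hypothetical slopes) are all invented for the example.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (adequate for small systems)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def comparative_statics(J, rhs_by_exog):
    """Return {exogenous variable: [derivative of each endogenous variable]}.

    J is the coefficient matrix of the total differentials (step 1).
    rhs_by_exog maps each exogenous variable to the right-hand-side column obtained
    after zeroing the other differentials and dividing through (step 2).
    """
    det_J = det(J)  # the |J| != 0 check; the value is reused for every factor
    if det_J == 0:
        raise ValueError("|J| = 0: the implicit-function theorem does not apply")
    results = {}
    for name, rhs in rhs_by_exog.items():  # step 4: one disequilibrating factor at a time
        derivs = []
        for k in range(len(J)):            # step 3: Cramer's rule, column k replaced
            Jk = [row[:k] + [rhs[i]] + row[k + 1:] for i, row in enumerate(J)]
            derivs.append(det(Jk) / det_J)
        results[name] = derivs
    return results


# Hypothetical closed-economy coefficients: [[1 - C'(1 - T'), -I'], [L_Y, L_r]].
J = [[0.4, 50.0], [0.5, -100.0]]
out = comparative_statics(J, {"G0": [1.0, 0.0], "M0s": [0.0, 1.0]})
for name, (dY, dr) in out.items():
    print(name, dY, dr)
```

With these assumed coefficients, an increase in G_0 raises both Y* and r*, while an increase in M_0^s raises Y* and lowers r*. The coefficient matrix, and hence |J|, is the same in both passes; only the right-hand-side column changes.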


* Instead of taking steps 1 and 2, we may equivalently resort to the total-derivative method by 
differentiating (both sides of) each equilibrium identity totally with respect to the selected exogenous 
variable. In so doing, a channel map will prove to be of help. 





EXERCISE 8.6 


1. Let the equilibrium condition for national income be

S(Y) + T(Y) = I(Y) + G_0    (S', T', I' > 0; S' + T' > I')

where S, Y, T, I, and G stand for saving, national income, taxes, investment, and
government expenditure, respectively. All derivatives are continuous.

(a) Interpret the economic meanings of the derivatives S', T', and I'.

(b) Check whether the conditions of the implicit-function theorem are satisfied. If so,
write the equilibrium identity.

(c) Find (dY*/dG_0) and discuss its economic implications.


2. Let the demand and supply functions for a commodity be

Q_d = D(P, Y_0)    (D_P < 0; D_{Y_0} > 0)

Q_s = S(P, T_0)    (S_P > 0; S_{T_0} < 0)

where Y_0 is income and T_0 is the tax on the commodity. All derivatives are continuous.

(a) Write the equilibrium condition in a single equation.

(b) Check whether the implicit-function theorem is applicable. If so, write the
equilibrium identity.

(c) Find (∂P*/∂Y_0) and (∂P*/∂T_0), and discuss their economic implications.

(d) Using a procedure similar to (8.37), find (∂Q*/∂Y_0) from the supply function and
(∂Q*/∂T_0) from the demand function. (Why not use the demand function for the
former, and the supply function for the latter?)


3. Solve Prob. 2 by the simultaneous-equation approach.

4. Let the demand and supply functions for a commodity be

\[
Q_d = D(\underset{(-)}{P},\ \underset{(+)}{t_0}) \qquad\text{and}\qquad Q_s = Q_{s0}
\]

where t_0 is consumers' taste for the commodity, and where both partial derivatives are
continuous.

(a) What is the meaning of the - and + signs beneath the independent variables P and
t_0?

(b) Write the equilibrium condition as a single equation.

(c) Is the implicit-function theorem applicable?

(d) How would the equilibrium price vary with consumers' taste?

5. Consider the following national-income model (with taxes ignored):

Y - C(Y) - I(i) - G_0 = 0    (0 < C' < 1; I' < 0)
kY + L(i) - M_{s0} = 0    (k = positive constant; L' < 0)

(a) Is the first equation in the nature of an equilibrium condition?

(b) What is the total quantity demanded for money in this model?

(c) Analyze the comparative statics of the model when money supply changes
(monetary policy) and when government expenditure changes (fiscal policy).

6. In Prob. 5, suppose that while the demand for money still depends on Y as specified, it
is now no longer affected by the interest rate.

(a) How should the model statement be revised?

(b) Write the new Jacobian, call it |J|'. Is |J|' numerically (in absolute value) larger or
smaller than |J|?

(c) Would the implicit-function rule still apply?

(d) Find the new comparative-static derivatives.

(e) Comparing the new (∂Y*/∂G_0) with that in Prob. 5, what can you conclude about
the effectiveness of fiscal policy in the new model where Y is independent of i?

(f) Comparing the new (∂Y*/∂M_{s0}) with that in Prob. 5, what can you say about the
effectiveness of monetary policy in the new model?


8.7 Limitations of Comparative Statics 





Comparative statics is a useful area of study, because in economics we are often interested
in finding out how a disequilibrating change in a parameter will affect the equilibrium state
of a model. It is important to realize, however, that by its very nature comparative statics
ignores the process of adjustment from the old equilibrium to the new and also neglects the
length of time required in that adjustment process. As a consequence, it must of necessity
also disregard the possibility that, because of the inherent instability of the model, the new
equilibrium may not be attainable ever. The study of the process of adjustment per se
belongs to the field of economic dynamics. When we come to that, particular attention will be
directed toward the manner in which a variable will change over time, and explicit
consideration will be given to the question of stability of equilibrium.

The important topic of dynamics, however, must wait its turn. Meanwhile, in Part 4, we 
shall undertake to study the problem of optimization, an exceedingly important special 
variety of equilibrium analysis with attendant comparative-static implications (and compli- 
cations) of its own. 





Part 4
Optimization Problems


Chapter 9
Optimization: A Special Variety of Equilibrium Analysis


When we first introduced the term equilibrium in Chap. 3, we made a broad distinction 
between goal and nongoal equilibrium. In the latter type, exemplified by our study of mar- 
ket and national-income models, the interplay of certain opposing forces in the model— 
e.g., the forces of demand and supply in the market models and the forces of leakages and
injections in the income models—dictates an equilibrium state, if any, in which these 
opposing forces are just balanced against each other, thus obviating any further tendency 
to change. The attainment of this type of equilibrium is the outcome of the impersonal bal- 
ancing of these forces and does not require the conscious effort on the part of anyone to 
accomplish a specified goal. True, the consuming households behind the forces of demand 
and the firms behind the forces of supply are each striving for an optimal position under
the given circumstances, but as far as the market itself is concerned, no one is aiming at
any particular equilibrium price or equilibrium quantity (unless, of course, the govern- 
ment happens to be trying to peg the price). Similarly, in national-income determination, 
the impersonal balancing of leakages and injections is what brings about an equilibrium 
state, and no conscious effort at reaching any particular goal (such as an attempt to alter 
an undesirable income level by means of monetary or fiscal policies) needs to be involved 
at all. 

In the present part of the book, however, our attention will be turned to the study of goal
equilibrium, in which the equilibrium state is defined as the optimum position for a given
economic unit (a household, a business firm, or even an entire economy) and in which the
said economic unit will be deliberately striving for attainment of that equilibrium. As a
result, in this context, but only in this context, our earlier warning that equilibrium does
not imply desirability becomes irrelevant and immaterial. In this part of the book, our
primary focus will be on the classical techniques for locating optimum positions, namely
those using differential calculus. More modern developments, known as mathematical
programming, will be discussed in Chap. 13.







9.1 Optimum Values and Extreme Values 





Economics is essentially a science of choice. When an economic project is to be carried 
out, such as the production of a specified level of output, there are normally a number of 
alternative ways of accomplishing it. One (or more) of these alternatives will, however, be
more desirable than others from the standpoint of some criterion, and it is the essence of
the optimization problem to choose, on the basis of that specified criterion, the best
alternative available.

The most common criterion of choice among alternatives in economics is the goal of 
maximizing something (such as maximizing a firm’s profit, a consumer's utility, or the rate 
of growth of a firm or of a country's economy) or of minimizing something (such as
minimizing the cost of producing a given output). Economically, we may categorize such
maximization and minimization problems under the general heading of optimization, meaning
“the quest for the best.” From a purely mathematical point of view, however, the terms max- 
imum and minimum do not carry with them any connotation of optimality. Therefore, the 
collective term for maximum and minimum, as mathematical concepts, is the more matter- 
of-fact designation extremum, meaning an extreme value. 

In formulating an optimization problem, the first order of business is to delineate an
objective function in which the dependent variable represents the object of maximization
or minimization and in which the set of independent variables indicates the objects whose
magnitudes the economic unit in question can pick and choose, with a view to optimizing.
We shall therefore refer to the independent variables as choice variables.† The essence of
the optimization process is simply to find the set of values of the choice variables that will
lead us to the desired extremum of the objective function.

For example, a business firm may seek to maximize profit π, that is, to maximize the
difference between total revenue R and total cost C. Since, within the framework of a given
state of technology and a given market demand for the firm's product, R and C are both
functions of the output level Q, it follows that π is also expressible as a function of Q:


π(Q) = R(Q) - C(Q)

This equation constitutes the relevant objective function, with π as the object of
maximization and Q as the (only) choice variable. The optimization problem is then that of
choosing the level of Q that maximizes π. Note that while the optimal level of π is by
definition its maximal level, the optimal level of the choice variable Q is itself not required
to be either a maximum or a minimum.
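
A small numerical sketch may make the last remark concrete. The revenue and cost functions below are hypothetical (they are not taken from the text), and the search over whole-number output levels from 0 to 100 is only a crude stand-in for the calculus methods developed later in the chapter.

```python
def revenue(q):
    """Hypothetical total-revenue function R(Q)."""
    return 100 * q - q ** 2


def cost(q):
    """Hypothetical total-cost function C(Q)."""
    return 20 * q + 50


def profit(q):
    """Objective function: pi(Q) = R(Q) - C(Q)."""
    return revenue(q) - cost(q)


# Choice variable Q restricted, for illustration, to the integers 0, 1, ..., 100.
candidates = range(0, 101)
q_star = max(candidates, key=profit)
pi_star = profit(q_star)

print("optimal output:", q_star)
print("maximal profit:", pi_star)
```

In this assumed example the profit-maximizing output lies strictly inside the range of admissible output levels: profit is at its maximum, but the optimal Q is neither the largest nor the smallest value the firm could choose.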

To cast the problem into a more general mold for further discussion (though still
confining ourselves to objective functions of one variable only), let us consider the general
function

y = f(x)

and attempt to develop a procedure for finding the level of x that will maximize or minimize
the value of y. It will be assumed in our discussion that the function f is continuously
differentiable.


† They can also be called decision variables, or policy variables.




9.2 Relative Maximum and Minimum: First-Derivative Test 





FIGURE 9.1 


Since the objective function y = f(x) is stated in the general form, there is no restriction as
to whether it is linear or nonlinear or whether it is monotonic or contains both increasing and
decreasing parts. From among the many possible types of function compatible with the
objective-function form discussed in Sec. 9.1, we have selected three specific cases to be
depicted in Fig. 9.1. Simple as they may be, the graphs in Fig. 9.1 should give us valuable
insight into the problem of locating the maximum or minimum value of the function y = f(x).


Relative versus Absolute Extremum 

If the objective function is a constant function, as in Fig. 9.1a, all values of the choice 
variable x will result in the same value of y, and the height of each point on the graph of 
the function (such as A or B or C) may be considered a maximum or, for that matter, a
minimum—or, indeed, neither. In this case, there is in effect no significant choice to be 
made regarding the value of x for the maximization or minimization of y. 

In Fig. 9.1b, the function is strictly increasing, and there is no finite maximum if the set
of nonnegative real numbers is taken to be its domain. However, we may consider the end
point D on the left (the y intercept) as representing a minimum; in fact, it is in this case the 
absolute (or global) minimum in the range of the function. 

The points E and F in Fig. 9.1c, on the other hand, are examples of a relative (or local)
extremum, in the sense that each of these points represents an extremum in the immediate
neighborhood of the point only. The fact that point F is a relative minimum is, of course, no
guarantee that it is also the global minimum of the function, although this may happen to
be the case. Similarly, a relative maximum point such as E may or may not be a global
maximum. Note also that a function can very well have several relative extrema, some of which
may be maxima while others are minima. 

(Fig. 9.1 has three panels, each plotting y against x: (a) a constant function with points
A, B, and C; (b) a strictly increasing function with left end point D; (c) a curve with a
relative maximum at E and a relative minimum at F.)

FIGURE 9.2 (Two panels: (a) a curve with sharp points A and B; (b) a smooth curve with
stationary points C and D.)

In most economic problems that we shall be dealing with, our primary, if not exclusive,
concern will be with extreme values other than end-point values, for with most such
problems the domain of the objective function is restricted to be the set of nonnegative real
numbers, and thus an end point (on the left) will represent the zero level of the choice
variable, which is often of no practical interest. Actually, the type of function most frequently
encountered in economic analysis is that shown in Fig. 9.1c, or some variant thereof that
contains only a single bend in the curve. We shall therefore continue our discussion mainly
with reference to the search for relative extrema such as points E and F. This will, however,
by no means foreclose the knowledge of an absolute maximum if we want it, because an
absolute maximum must be either a relative maximum or one of the end points of the
function. Thus if we know all the relative maxima, it is necessary only to select the largest
of these and compare it with the end points in order to determine the absolute maximum.
The absolute minimum of a function can be found analogously. Hereafter, the extreme
values considered will be relative or local ones, unless indicated otherwise.
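
The comparison of relative maxima with end points can be sketched in a few lines. The cubic and the closed interval [0, 9] below are assumptions made purely for illustration, and the location of the relative maximum (x = 2) is taken as already known rather than derived here.

```python
def f(x):
    """Illustrative objective function (an assumed cubic)."""
    return x ** 3 - 12 * x ** 2 + 36 * x + 8


relative_maxima = [2.0]    # locations of the relative maxima inside the domain (assumed known)
end_points = [0.0, 9.0]    # end points of the assumed domain [0, 9]

# The absolute maximum is the largest value among the relative maxima and the end points.
candidates = relative_maxima + end_points
x_abs = max(candidates, key=f)

for x in candidates:
    print(x, f(x))
print("absolute maximum at x =", x_abs, "with value", f(x_abs))
```

In this assumed case the relative maximum (value 40 at x = 2) is not the absolute maximum; the right end point gives the larger value, which is exactly why the end points must be included in the comparison.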


First-Derivative Test 

As a matter of terminology, from now on we shall refer to the derivative of a function
alternatively as its first derivative (short for first-order derivative). The reason for this will
become apparent shortly.

Given a function y = f(x), the first derivative f'(x) plays a major role in our search for
its extreme values. This is due to the fact that, if a relative extremum of the function occurs
at x = x_0, then either (1) f'(x_0) does not exist, or (2) f'(x_0) = 0. The first eventuality is
illustrated in Fig. 9.2a, where both points A and B depict relative extreme values of y, and
yet no derivative is defined at either of these sharp points. Since in the present discussion
we are assuming that y = f(x) is continuous and possesses a continuous derivative,
however, we are in effect ruling out sharp points. For smooth functions, relative extreme values
can occur only where the first derivative has a zero value. This is illustrated by points C and
D in Fig. 9.2b, both of which represent extreme values, and both of which are characterized
by a zero slope: f'(x_1) = 0 and f'(x_2) = 0. It is also easy to see that when the slope is
nonzero we cannot possibly have a relative minimum (the bottom of a valley) or a relative
maximum (the peak of a hill). For this reason, we can, in the context of smooth functions,
take the condition f'(x) = 0 to be a necessary condition for a relative extremum (either
maximum or minimum).

We must hasten to add, however, that a zero slope, while necessary, is not sufficient to 
establish a relative extremum. An example of the case where a zero slope is not associated
with an extremum will be presented shortly. By appending a certain proviso to the zero- 
slope condition, however, we can obtain a decisive test for a relative extremum. This may 
be stated as follows: 


First-derivative test for relative extremum  If the first derivative of a function f(x) at
x = x_0 is f'(x_0) = 0, then the value of the function at x_0, f(x_0), will be

a. A relative maximum if the derivative f'(x) changes its sign from positive to negative
from the immediate left of the point x_0 to its immediate right.

b. A relative minimum if f'(x) changes its sign from negative to positive from the
immediate left of x_0 to its immediate right.

c. Neither a relative maximum nor a relative minimum if f'(x) has the same sign on both
the immediate left and the immediate right of point x_0.


Let us call the value x_0 a critical value of x if f'(x_0) = 0, and refer to f(x_0) as a
stationary value of y (or of the function f). The point with coordinates x_0 and f(x_0) can,
accordingly, be called a stationary point. (The rationale for the word stationary should be
self-evident: wherever the slope is zero, the point in question is never situated on an
upward or downward incline, but is rather at a standstill position.) Then, graphically, the
first possibility listed in this test will establish the stationary point as the peak of a hill, such
as point D in Fig. 9.2b, whereas the second possibility will establish the stationary point as
the bottom of a valley, such as point C in the same diagram. Note, however, that in view of
the existence of a third possibility, yet to be discussed, we are unable to regard the
condition f'(x) = 0 as a sufficient condition for a relative extremum. But we now see that, if the
necessary condition f'(x) = 0 is satisfied, then the change-of-derivative-sign proviso can
serve as a sufficient condition for a relative maximum or minimum, depending on the
direction of the sign change.
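
The three cases of the first-derivative test can be sketched as a short routine. This is only an illustration under assumptions: the "immediate left" and "immediate right" of x_0 are approximated by a small step h, whose default value (like the tolerance tol and the function name) is an arbitrary choice not found in the text.

```python
def first_derivative_test(f_prime, x0, h=1e-3, tol=1e-9):
    """Classify the stationary value at x0 from the sign of f'(x) on either side.

    h approximates the immediate neighborhood of x0; tol decides whether f'(x0)
    counts as zero. Both are assumptions of this sketch.
    """
    if abs(f_prime(x0)) > tol:
        raise ValueError("x0 is not a critical value: f'(x0) is not zero")
    left = f_prime(x0 - h)
    right = f_prime(x0 + h)
    if left > 0 and right < 0:
        return "relative maximum"   # case a: sign changes from + to -
    if left < 0 and right > 0:
        return "relative minimum"   # case b: sign changes from - to +
    if left * right > 0:
        return "neither"            # case c: same sign on both sides
    return "inconclusive"           # f' vanishes at a neighboring point as well


# Three simple derivative functions, each with a critical value at x = 0.
print(first_derivative_test(lambda x: -2 * x, 0.0))      # derivative of -x**2
print(first_derivative_test(lambda x: 2 * x, 0.0))       # derivative of x**2
print(first_derivative_test(lambda x: 3 * x ** 2, 0.0))  # derivative of x**3
```

The third call illustrates case c: the derivative of x**3 is zero at x = 0 but positive on both sides, so the stationary value there is neither a relative maximum nor a relative minimum.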

Let us now explain the third possibility. In Fig. 9.3a, the function f is shown to attain
a zero slope at point J (when x = j). Even though f'(j) is zero (which makes f(j) a
stationary value), the derivative does not change its sign from one side of x = j to the
other; therefore, according to the first-derivative test, point J gives neither a maximum nor
a minimum, as is duly confirmed by the graph of the function. Rather, it exemplifies what
is known as an inflection point.

FIGURE 9.3 (Four panels: (a) y = f(x) with inflection point J at x = j; (b) y = g(x) with
inflection point K at x = k; (a') the derivative dy/dx = f'(x) with point J'; (b') the
derivative dy/dx = g'(x) with point K'.)

The characteristic feature of an inflection point is that, at that point, the derivative (as
against the primitive) function reaches an extreme value. Since this extreme value can be
either a maximum or a minimum, we have two types of inflection points. In Fig. 9.3a',
where we have plotted the derivative f'(x), we see that its value is zero when x = j (see
point J') but is positive on both sides of point J'; this makes J' a minimum point of the
derivative function f'(x).

The other type of inflection point is portrayed in Fig. 9.3b, where the slope of the
function g(x) increases till the point k is reached and decreases thereafter. Consequently, the
graph of the derivative function g'(x) will assume the shape shown in Fig. 9.3b', where
point K' gives a maximum value of the derivative function g'(x).*

To sum up: A relative extremum must be a stationary value, but a stationary value may
be associated with either a relative extremum or an inflection point. To find the relative
maximum or minimum of a given function, therefore, the procedure should be first to find
the stationary values of the function where the condition f'(x) = 0 is satisfied, and then to
apply the first-derivative test to determine whether each of the stationary values is a relative
maximum, a relative minimum, or neither.


Example 1

Find the relative extrema of the function

y = f(x) = x^3 - 12x^2 + 36x + 8

First, we find the derivative function to be

f'(x) = 3x^2 - 24x + 36

To get the critical values, i.e., the values of x satisfying the condition f'(x) = 0, we set the
quadratic derivative function equal to zero and get the quadratic equation

3x^2 - 24x + 36 = 0

By factoring the polynomial or by applying the quadratic formula, we then obtain the
following pair of roots (solutions):

x_1* = 6    [at which we have f'(6) = 0 and f(6) = 8]
x_2* = 2    [at which we have f'(2) = 0 and f(2) = 40]

Since f'(6) = f'(2) = 0, these two values of x are the critical values we desire.

It is easy to verify that, in the immediate neighborhood of x = 6, we have f'(x) < 0 for
x < 6, and f'(x) > 0 for x > 6; thus the value of the function f(6) = 8 is a relative
minimum. Similarly, since, in the immediate neighborhood of x = 2, we find f'(x) > 0 for
x < 2, and f'(x) < 0 for x > 2, the value of the function f(2) = 40 is a relative
maximum.
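
The arithmetic of this example can be reproduced with a short script. It is a sketch only: the step h = 0.01 used to probe the "immediate neighborhood" of each critical value is an arbitrary assumption, and the variable names are illustrative.

```python
import math


def f(x):
    return x ** 3 - 12 * x ** 2 + 36 * x + 8


def f_prime(x):
    return 3 * x ** 2 - 24 * x + 36


# Critical values from the quadratic formula applied to 3x^2 - 24x + 36 = 0.
a, b, c = 3, -24, 36
disc = b * b - 4 * a * c
roots = sorted([(-b - math.sqrt(disc)) / (2 * a), (-b + math.sqrt(disc)) / (2 * a)])

h = 0.01  # assumed size of the neighborhood used for the sign check
results = {}
for x0 in roots:
    left, right = f_prime(x0 - h), f_prime(x0 + h)
    if left > 0 and right < 0:
        kind = "relative maximum"
    elif left < 0 and right > 0:
        kind = "relative minimum"
    else:
        kind = "neither"
    results[x0] = (f(x0), kind)
    print(x0, f(x0), kind)
```

The script reports a relative maximum of 40 at x = 2 and a relative minimum of 8 at x = 6, matching the values found above.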


* Note that a zero derivative value, while a necessary condition for a relative extremum, is not
required for an inflection point; for the derivative g'(x) has a positive value at x = k, and yet point K is
an inflection point.


FIGURE 9.4 (Graph of y = x^3 - 12x^2 + 36x + 8 against x.)











Figure 9.4 shows the graph of the function of this example. Such a graph may be used
to verify the location of extreme values obtained through use of the first-derivative test.
But, in reality, in most cases "helpfulness" flows in the opposite direction: the
mathematically derived extreme values will help in plotting the graph. The accurate plotting of a
graph ideally requires knowledge of the value of the function at every point in the domain;
but as a matter of actual practice, only a few points in the domain are selected for purposes
of plotting, and the rest of the points typically are filled in by interpolation. The pitfall of
this practice is that, unless we hit upon the stationary point(s) by coincidence, we shall miss
the exact location of the turning point(s) in the curve. Now, with the first-derivative test at
our disposal, it becomes possible to locate these turning points precisely.


Example 2

Find the relative extremum of the average-cost function

AC = f(Q) = Q^2 - 5Q + 8

The derivative here is f'(Q) = 2Q - 5, a linear function. Setting f'(Q) equal to zero, we get
the linear equation 2Q - 5 = 0, which has the single root Q* = 2.5. This is the only critical
value in this case. To apply the first-derivative test, let us find the values of the derivative
at, say, Q = 2.4 and Q = 2.6, respectively. Since f'(2.4) = -0.2 < 0 whereas f'(2.6) =
0.2 > 0, we can conclude that the stationary value AC = f(2.5) = 1.75 represents a relative
minimum. The graph of the function of this example is actually a U-shaped curve, so that
the relative minimum already found will also be the absolute minimum. Our knowledge of
the exact location of this point should be of great help in plotting the AC curve.
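
The same check can be run for the average-cost function. The sketch below evaluates the derivative at the two test points used in the example and then compares the stationary value with AC on a grid of output levels; the grid (0 to 10 in steps of 0.1) is an arbitrary assumption used only to illustrate the U shape.

```python
def ac(q):
    """Average-cost function of the example: AC = Q^2 - 5Q + 8."""
    return q ** 2 - 5 * q + 8


def ac_prime(q):
    """Its first derivative: 2Q - 5."""
    return 2 * q - 5


q_star = 5 / 2                              # root of 2Q - 5 = 0
left, right = ac_prime(2.4), ac_prime(2.6)  # sign of the slope on either side of Q*

grid = [k / 10 for k in range(0, 101)]      # assumed grid of output levels
is_lowest_on_grid = all(ac(q_star) <= ac(q) for q in grid)

print(left, right)
print(ac(q_star), is_lowest_on_grid)
```

The slope is negative just below Q* = 2.5 and positive just above it, and no grid point gives a lower average cost than 1.75, which is consistent with the relative minimum also being the absolute minimum.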





EXERCISE 9.2 


1. Find the stationary values of the following (check whether they are relative maxima or
minima or inflection points), assuming the domain to be the set of all real numbers:

(a) y = -2x^2 + 8x + 7    (b) y = 5x^2 + x    (c) y = 3x^2 + 3    (d) y = 3x^2 - 6x + 2

2. Find the stationary values of the following (check whether they are relative maxima or
minima or inflection points), assuming the domain to be the interval [0, ∞):
(a) y = x^3 - 3x + 5
(b) y = (1/3)x^3 - x^2 + x + 10
(c) y = -x^3 + 4.5x^2 - 6x + 6
3. Show that the function y = x + 1/x (with x ≠ 0) has two relative extrema, one a
maximum and the other a minimum. Is the "minimum" larger or smaller than the
"maximum"? How is this paradoxical result possible?
4. Let T = f(x) be a total function (e.g., total product or total cost):
(a) Write out the expressions for the marginal function M and the average function A.
(b) Show that, when A reaches a relative extremum, M and A must have the same
value.
(c) What general principle does this suggest for the drawing of a marginal curve and
an average curve in the same diagram?
(d) What can you conclude about the elasticity of the total function T at the point
where A reaches an extreme value?


9.3 Second and Higher Derivatives





Hitherto we have considered only the first derivative f’(x) of a function y = f(x); now let 
us introduce the concept of second derivative (short for second-order derivative), and 
derivatives of even higher orders. These will enable us to develop alternative criteria for 
locating the relative extrema of a function.


Derivative of a Derivative 
Since the first derivative f'(x) is itself a function of x, it, too, should be differentiable with
respect to x, provided that it is continuous and smooth. The result of this differentiation,
known as the second derivative of the function f, is denoted by

f''(x)    where the double prime indicates that f(x) has been differentiated with
          respect to x twice, and where the expression (x) following the double
          prime suggests that the second derivative is again a function of x

or

\[ \frac{d^2 y}{dx^2} \]

where the notation stems from the consideration that the second derivative means, in
fact, \( \frac{d}{dx}\left(\frac{dy}{dx}\right) \); hence, the d^2 (read: "d-two") in the numerator
and dx^2 (read: "dx squared") in the denominator of this symbol.

If the second derivative f''(x) exists for all x values in the domain, the function f(x) is said
to be twice differentiable; if, in addition, f''(x) is continuous, the function f(x) is said to
be twice continuously differentiable. Just as the notation f ∈ C^(1) or f ∈ C' is often used
to indicate that the function f is continuously differentiable, an analogous notation

f ∈ C^(2)    or    f ∈ C''

can be used to signify that f is twice continuously differentiable.




As a function of x, the second derivative can be differentiated with respect to x again to
produce a third derivative, which in turn can be the source of a fourth derivative, and so on
ad infinitum, as long as the differentiability condition is met. These higher-order derivatives
are symbolized along the same line as the second derivative:

\[ f'''(x),\; f^{(4)}(x),\; \ldots,\; f^{(n)}(x) \qquad \text{[with superscripts enclosed in ( )]} \]

or

\[ \frac{d^3 y}{dx^3},\; \frac{d^4 y}{dx^4},\; \ldots,\; \frac{d^n y}{dx^n} \]

The last of these can also be written as (dⁿ/dxⁿ)y, where the dⁿ/dxⁿ part serves as an operator
symbol instructing us to take the nth derivative of (some function) with respect to x.

Almost all the specific functions we shall be working with possess continuous derivatives
up to any order we desire; i.e., they are continuously differentiable any number of
times. Whenever a general function is used, such as f(x), we always assume that it has 
derivatives up to any order we need. 


Example 1

Find the first through the fifth derivatives of the function

\[ y = f(x) = 4x^4 - x^3 + 17x^2 + 3x - 1 \]

The desired derivatives are as follows:

\[
\begin{aligned}
f'(x) &= 16x^3 - 3x^2 + 34x + 3 \\
f''(x) &= 48x^2 - 6x + 34 \\
f'''(x) &= 96x - 6 \\
f^{(4)}(x) &= 96 \\
f^{(5)}(x) &= 0
\end{aligned}
\]


In this particular (polynomial) example, we note that each successive derivative function 
emerges as a lower-order polynomial—from cubic to quadratic, to linear, to constant. We 
note also that the fifth derivative, being the derivative of a constant, is equal to zero for all 
values of x; we could therefore have written it as f^(5)(x) ≡ 0 as well. The equation
f^(5)(x) ≡ 0 should be carefully distinguished from the equation f^(5)(x₀) = 0 (zero at x₀
only). Also, understand that the statement f^(5)(x) ≡ 0 does not mean that the fifth derivative
does not exist; it indeed exists, and has the value zero.
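
The step-down from quartic to constant to zero can be traced mechanically. The following is a minimal sketch, not part of the original exposition: it stores a polynomial as a list of coefficients (an assumed representation, lowest power first) and applies the power rule repeatedly. The helper names are hypothetical, and the constant term of f is taken to be −1, which has no effect on any derivative.

```python
def poly_derivative(coeffs):
    """Differentiate a polynomial stored as [c0, c1, c2, ...], where c_k multiplies x**k."""
    derived = [k * c for k, c in enumerate(coeffs)][1:]
    # The derivative of a constant is the zero polynomial, stored here as [0].
    return derived or [0]


def successive_derivatives(coeffs, order):
    """Return the 1st through `order`-th derivatives as a list of coefficient lists."""
    results = []
    current = coeffs
    for _ in range(order):
        current = poly_derivative(current)
        results.append(current)
    return results


# f(x) = 4x^4 - x^3 + 17x^2 + 3x - 1, lowest power first.
# Assumption: the constant term is read as -1; it drops out after one differentiation.
f_coeffs = [-1, 3, 17, -1, 4]

derivatives = successive_derivatives(f_coeffs, 5)
for n, d in enumerate(derivatives, start=1):
    print(f"derivative {n}: {d}")
```

Each printed list is one degree shorter than the one before it, and the fifth is the zero polynomial, which exists and is identically zero.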


Example 2

Find the first four derivatives of the rational function

\[ y = g(x) = \frac{x}{1 + x} \qquad (x \neq -1) \]

These derivatives can be found either by use of the quotient rule, or, after rewriting the
function as y = x(1 + x)^(−1), by the product rule:

\[
\begin{aligned}
g'(x) &= (1 + x)^{-2} \\
g''(x) &= -2(1 + x)^{-3} \\
g'''(x) &= 6(1 + x)^{-4} \\
g^{(4)}(x) &= -24(1 + x)^{-5}
\end{aligned}
\qquad (x \neq -1)
\]

In this case, repeated differentiation evidently does not tend to simplify the subsequent
derivative expressions.





Note that, like the primitive function g(x), all the successive derivatives obtained are
themselves functions of x. Given specific values of x, however, these derivative functions
will then take specific values. When x = 2, for instance, the second derivative in Example 2
can be evaluated as

\[ g''(2) = -2(3)^{-3} = -\frac{2}{27} \]

and similarly for other values of x. It is of the utmost importance to realize that to evaluate
this second derivative g''(x) at x = 2, as we did, we must first obtain g''(x) from g'(x) and
then substitute x = 2 into the equation for g''(x). It is incorrect to substitute x = 2 into
g(x) or g'(x) prior to the differentiation process leading to g''(x).
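
The order of operations (differentiate first, substitute afterward) can be checked numerically. The sketch below is illustrative only: it evaluates the second-derivative function at x = 2 with exact fractions and compares the result with a central second difference of g itself. The step size h is an arbitrary assumption, and the function names are hypothetical.

```python
from fractions import Fraction


def g(x):
    """Primitive function g(x) = x / (1 + x), defined for x != -1."""
    return x / (1 + x)


def g_second(x):
    """Second-derivative function g''(x) = -2 (1 + x)^(-3)."""
    return Fraction(-2) / (1 + x) ** 3


x0 = Fraction(2)
exact = g_second(x0)            # differentiate first, then substitute x = 2

# Cross-check with a central second difference of g around x0.
h = Fraction(1, 1000)           # hypothetical step size
approx = (g(x0 + h) - 2 * g(x0) + g(x0 - h)) / h ** 2

print(exact)
print(float(approx))
```

Had we substituted x = 2 into g(x) first, we would have been left with the constant 2/3, whose derivative is zero and tells us nothing about the curvature of g at that point.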


Interpretation of the Second Derivative

The derivative function f'(x) measures the rate of change of the function f. By the same
token, the second-derivative function f'' is the measure of the rate of change of the first
derivative f'; in other words, the second derivative measures the rate of change of the rate
of change of the original function f. To put it differently, with a given infinitesimal increase
in the independent variable x from a point x = x₀,

    f'(x₀) > 0   means that the value of the function tends to increase
    f'(x₀) < 0   means that the value of the function tends to decrease

whereas, with regard to the second derivative,

    f''(x₀) > 0   means that the slope of the curve tends to increase
    f''(x₀) < 0   means that the slope of the curve tends to decrease

Thus a positive first derivative coupled with a positive second derivative at x = x₀
implies that the slope of the curve at that point is positive and increasing. In other words,
the value of the function is increasing at an increasing rate. Likewise, a positive first
derivative with a negative second derivative indicates that the slope of the curve is positive
but decreasing; the value of the function is increasing at a decreasing rate. The case of a
negative first derivative can be interpreted analogously, but a warning is in order in this case:
When f'(x₀) < 0 and f''(x₀) > 0, the slope of the curve is negative and increasing, but
this does not mean that the slope is changing, say, from (−10) to (−11); on the contrary, the
change should be from (−11), a smaller number, to (−10), a larger number. In other words,
the negative slope must tend to be less steep as x increases. Lastly, when f'(x₀) < 0 and
f''(x₀) < 0, the slope of the curve must be negative and decreasing. This refers to a
negative slope that tends to become steeper as x increases.
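
The warning about a negative but increasing slope is easy to confirm with numbers. As a small sketch, take the hypothetical function f(x) = x² on the negative part of its domain, where f'(x) = 2x < 0 and f''(x) = 2 > 0; the two sample points are chosen (by assumption) so that the slopes come out as −11 and −10.

```python
def f_prime(x):
    """Slope of the hypothetical function f(x) = x**2, whose second derivative is 2 > 0."""
    return 2 * x


x_left, x_right = -5.5, -5.0     # x increases from left to right
slope_left = f_prime(x_left)     # -11.0
slope_right = f_prime(x_right)   # -10.0

print(slope_left, slope_right)
print(slope_right > slope_left)              # the slope has increased
print(abs(slope_right) < abs(slope_left))    # the curve has become less steep
```

The slope rises from −11 to −10 as x increases, so "increasing" here means becoming less negative, that is, less steep.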

All of this can be further clarified with a graphical explanation. Figure 9.5a illustrates a
function with f''(x) < 0 throughout. Since the slope must steadily decrease as x increases
on the graph, we will, when we move from left to right, pass through a point A with a
positive slope, then a point B with zero slope, and then a point C with a negative slope. It may
happen, of course, that a function with f''(x) < 0 is characterized by f'(x) > 0 everywhere,
and thus plots only as the rising portion of an inverse U-shaped curve, or, with
f'(x) < 0 everywhere, plots only as the declining portion of that curve.

The opposite case of a function with f''(x) > 0 throughout is illustrated in Fig. 9.5b.
Here, as we pass through points D to E to F, the slope steadily increases and changes from
negative to zero to positive. Again, we add that a function characterized by f''(x) > 0
throughout may, depending on the first-derivative specification, plot only as the declining
or the rising portion of a U-shaped curve.

FIGURE 9.5  [figure not reproduced: panel (a) shows a curve with f''(x) < 0 passing through
points A, B, and C; panel (b) shows a curve with f''(x) > 0 passing through points D, E, and F]

From Fig. 9.5, it is evident that the second derivative f''(x) relates to the curvature of a
graph; it determines how the curve tends to bend itself. To describe the two types of
differing curvatures discussed, we refer to the one in Fig. 9.5a as strictly concave, and the one in
Fig. 9.5b as strictly convex. And, understandably, a function whose graph is strictly concave
(strictly convex) is called a strictly concave (strictly convex) function. The precise geometric
characterization of a strictly concave function is as follows. If we pick any pair of points
M and N on its curve and join them by a straight line, the line segment MN must lie entirely
below the curve, except at points M and N. The characterization of a strictly convex
function can be obtained by substituting the word above for the word below in the last statement.
Try this out in Fig. 9.5. If the characterizing condition is relaxed somewhat, so that the line
segment MN is allowed to lie either below the curve, or along (coinciding with) the curve,
then we will be describing instead a concave function, without the adverb strictly. Similarly,
if the line segment MN either lies above, or lies along the curve, then the function is
convex, again without the adverb strictly. Note that, since the line segment MN may
coincide with a (nonstrictly) concave or convex curve, the latter may very well contain a linear
segment. In contrast, a strictly concave or convex curve can never contain a linear segment
anywhere. It follows that while a strictly concave (convex) function is automatically a
concave (convex) function, the converse is not true.¹
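
The chord characterization lends itself to a simple sampled check. The sketch below is only an illustration under stated assumptions: it tests a finite number of interior points between M and N (so it can refute strict concavity but cannot prove it), it uses exact fractions to avoid rounding ties, and both example functions are hypothetical.

```python
from fractions import Fraction


def chord_strictly_below(f, m, n, samples=99):
    """Sampled check that the chord from (m, f(m)) to (n, f(n)) lies strictly below f."""
    for i in range(1, samples + 1):
        t = Fraction(i, samples + 1)
        x = (1 - t) * m + t * n                 # interior point between m and n
        chord_height = (1 - t) * f(m) + t * f(n)
        if not chord_height < f(x):
            return False
    return True


def strictly_concave_example(x):
    return -(x - 2) ** 2        # hypothetical inverse U-shaped curve


def linear_example(x):
    return 3 * x + 1            # hypothetical straight line


print(chord_strictly_below(strictly_concave_example, 0, 5))
print(chord_strictly_below(linear_example, 0, 5))
```

The straight line fails the strict test because its chord coincides with the curve; it is concave (and convex) only in the nonstrict sense.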

From our earlier discussion of the second derivative, we may now infer that if the second
derivative f''(x) is negative for all x, then the primitive function f(x) must be a strictly
concave function. Similarly, f(x) must be strictly convex, if f''(x) is positive for all x.
Despite this, it is not valid to reverse this inference and say that, if f(x) is strictly concave
(strictly convex), then f''(x) must be negative (positive) for all x. This is because, in certain
exceptional cases, the second derivative may have a zero value at a stationary point on such
a curve. An example of this can be found in the function y = f(x) = x⁴, which plots as a
strictly convex curve, but whose derivatives

\[ f'(x) = 4x^3 \qquad f''(x) = 12x^2 \]

indicate that, at the stationary point where x = 0, the value of the second derivative is
f''(0) = 0. Note, however, that at any other point, with x ≠ 0, the second derivative of this
function does have the (expected) positive sign. Aside from the possibility of a zero value
at a stationary point, therefore, the second derivative of a strictly concave or convex
function may be expected in general to adhere to a single algebraic sign.

¹ We shall discuss these concepts further in Sec. 11.5.

For other types of function, the second derivative may take both positive and negative
values, depending on the value of x. In Fig. 9.3a and b, for instance, both f(x) and g(x)
undergo a sign change in the second derivative at their respective inflection points J and K.
According to Fig. 9.3a', the slope of f'(x) (that is, the value of f''(x)) changes from
negative to positive at x = j; the exact opposite occurs with the slope of g'(x) (that is, the
value of g''(x)) on the basis of Fig. 9.3b'. Translated into curvature terms, this means that
the graph of f(x) turns from strictly concave to strictly convex at point J, whereas the
graph of g(x) has the reverse change at point K. Consequently, instead of characterizing an
inflection point as a point where the first derivative reaches an extreme value, we may
alternatively characterize it as a point where the function undergoes a change in curvature
or a change in the sign of its second derivative.


An Application 
The two curves in Fig. 9.5 exemplify the graphs of quadratic functions, which may be
expressed generally in the form

\[ y = ax^2 + bx + c \qquad (a \neq 0) \]

From our discussion of the second derivative, we can now derive a convenient way of
determining whether a given quadratic function will have a strictly convex (U-shaped) or
a strictly concave (inverse U-shaped) graph.

Since the second derivative of the quadratic function cited is d²y/dx² = 2a, this derivative
will always have the same algebraic sign as the coefficient a. Recalling that a positive
second derivative implies a strictly convex curve, we can infer that a positive coefficient a
in the preceding quadratic function gives rise to a U-shaped graph. In contrast, a negative
coefficient a leads to a strictly concave curve, shaped like an inverted U.

As intimated at the end of Sec. 9.2, the relative extremum of this function will also prove
to be its absolute extremum, because in a quadratic function there can be found only a
single valley or peak, evident in a U or inverted U, respectively.


Attitudes toward Risk 

The most common application of the concept of marginal utility is to the context of goods
consumption. But in another useful application, we consider the marginal utility of income,
or more to the point of the present discussion, the payoff to a betting game, and use this
concept to distinguish between different individuals' attitudes toward risk.

Consider the game where, for a fixed sum of money paid in advance (the cost of
the game), you can throw a die and collect $10 if an odd number shows up, or $20 if the
number is even. In view of the equal probability of the two outcomes, the mathematically
expected value of payoff is


EV =0.5 x $10 +0.5 x $20 = $15 


FIGURE 9.6  [figure not reproduced: panel (a) plots a strictly concave utility function U(x)
against the payoff x($), with points M, A, N on the curve and B on the line segment MN;
panel (b) plots a strictly convex utility function with the corresponding points M', A', N', and B']


The game is deemed a fair game, or fair bet, if the cost of the game is exactly $15. Despite
its fairness, playing such a game still involves a risk, for even though the probability
distribution of the two possible outcomes is known, the actual result of any individual play is not.
Hence, people who are "risk-averse" would consistently decline to play such a game. On
the other hand, there are "risk-loving" or "risk-preferring" people who would welcome fair
games, or even games with odds set against them (i.e., with the cost of the game exceeding
the expected value of payoff).

The explanation for such diverse attitudes toward risk is easily found in the differing
utility functions people possess. Assume that a potential player has the strictly concave
utility function U = U(x) depicted in Fig. 9.6a, where x denotes the payoff, with U(0) = 0,
U'(x) > 0 (positive marginal utility of income or payoff), and U''(x) < 0 (diminishing
marginal utility) for all x. The economic decision facing this person involves the choice
between two courses of action: First, by not playing the game, the person saves the $15 cost
of the game (= EV) and thus enjoys the utility level U($15), measured by the height of
point A on the curve. Second, by playing, the person has a .5 probability of receiving $10
and thus enjoying U($10) (see point M), plus a .5 probability of receiving $20 and thus
enjoying U($20) (see point N). The expected utility from playing is, therefore, equal to


EU = 0.5 x U($10) + 0.5 x U($20) 


which, being the average of the height of M and that of N, is measured by the height of point
B, the midpoint on the line segment MN. Since, by the defining property of a strictly
concave utility function, line segment MN must lie below arc MN, point B must be lower than
point A; that is, EU, the expected utility from playing, falls short of the utility of the cost of
the game, and the game should be avoided. For this reason, a strictly concave utility
function is associated with risk-averse behavior.

For a risk-loving person, the decision process is analogous, but the opposite choice will
be made, because now the relevant utility function is a strictly convex one. In Fig. 9.6b,
U($15), the utility of keeping the $15 by not playing the game, is shown by point A' on the
curve, and EU, the expected utility from playing, is given by B', the midpoint on the line
segment M'N'. But this time line segment M'N' lies above arc M'N', and point B' is above
point A'. Thus there definitely is a positive incentive to play the game. In contrast to the
situation in Fig. 9.6a, we can thus associate a strictly convex utility function with risk-loving
behavior.

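
The comparison between U($15) and EU can be reproduced with concrete utility functions. The sketch below is illustrative only: the square-root and quadratic forms are assumptions chosen because they are strictly concave and strictly convex, respectively, with U(0) = 0 and positive marginal utility for x > 0; the text itself does not specify any functional form.

```python
import math


def expected_utility(utility, payoffs, probabilities):
    """Probability-weighted average of the utility of each payoff."""
    return sum(p * utility(x) for x, p in zip(payoffs, probabilities))


payoffs = [10, 20]            # dollars collected on an odd or an even throw
probabilities = [0.5, 0.5]
cost = 15                     # a fair game: the cost equals the expected payoff


def risk_averse_utility(x):
    return math.sqrt(x)       # assumed strictly concave form, U'' < 0 for x > 0


def risk_loving_utility(x):
    return x ** 2             # assumed strictly convex form, U'' > 0


for name, u in [("risk-averse", risk_averse_utility), ("risk-loving", risk_loving_utility)]:
    eu = expected_utility(u, payoffs, probabilities)
    print(name, "EU =", round(eu, 4), "U(15) =", round(u(cost), 4), "plays:", eu > u(cost))
```

With the concave form EU falls short of U(15) and the game is declined; with the convex form the inequality is reversed and the game is accepted.
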
EXERCISE 9.3 
1. Find the second and third derivatives of the following functions:
(a) ax² + bx + c            (c) 3x/(1 − x)   (x ≠ 1)
(b) 7x⁴ − 3x − 4            (d) (1 + x)/(1 − x)   (x ≠ 1)


2. Which of the following quadratic functions are strictly convex?
(a) y = 9x² − 4x + 8        (c) u = 9 − 2x²
(b) w = −3x² + 39           (d) v = 8 − 5x + x²

3. Draw (a) a concave curve which is not strictly concave, and (b) a curve which qualifies
simultaneously as a concave curve and a convex curve.

4. Given the function y = a − b/(c + x) (a, b, c > 0; x ≥ 0), determine the general shape of
its graph by examining (a) its first and second derivatives, (b) its vertical intercept, and
(c) the limit of y as x tends to infinity. If this function is to be used as a consumption
function, how should the parameters be restricted in order to make it economically sensible?

5. Draw the graph of a function f(x) such that f'(x) ≡ 0, and the graph of a function g(x)
such that g'(3) = 0. Summarize in one sentence the essential difference between f(x)
and g(x) in terms of the concept of stationary point.

6. A person who is neither risk-averse nor risk-loving (indifferent toward a fair game) is
said to be "risk-neutral."

(a) What kind of utility function would you use to characterize such a person?
(b) Using the die-throwing game detailed in the text, describe the relationship between
U($15) and EU for the risk-neutral person.


9.4 Second-Derivative Test 





Returning to the pair of extreme points B and E in Fig. 9.5 and remembering the newly
established relationship between the second derivative and the curvature of a curve, we
should be able to see the validity of the following criterion for a relative extremum:


Second-derivative test for relative extremum  If the value of the first derivative of a
function f at x = x₀ is f'(x₀) = 0, then the value of the function at x₀, f(x₀), will be

a. A relative maximum if the second-derivative value at x₀ is f''(x₀) < 0.
b. A relative minimum if the second-derivative value at x₀ is f''(x₀) > 0.


This test is in general more convenient to use than the first-derivative test, because it does
not require us to check the derivative sign to both the left and the right of x₀. But it has the
drawback that no unequivocal conclusion can be drawn in the event that f''(x₀) = 0. For then
the stationary value f(x₀) can be either a relative maximum, or a relative minimum, or even
an inflectional value.¹ When the situation of f''(x₀) = 0 is encountered, we must either revert
to the first-derivative test, or resort to another test, to be developed in Sec. 9.6, that involves
the third or even higher derivatives. For most problems in economics, however, the
second-derivative test would usually be adequate for determining a relative maximum or minimum.


Example 1

Find the relative extremum of the function

\[ y = f(x) = 4x^2 - x \]

The first and second derivatives are

\[ f'(x) = 8x - 1 \qquad \text{and} \qquad f''(x) = 8 \]

Setting f'(x) equal to zero and solving the resulting equation, we find the (only) critical
value to be x* = 1/8, which yields the (only) stationary value f(1/8) = −1/16. Because the
second derivative is positive (in this case it is indeed positive for any value of x), the
extremum is established as a minimum. Further, since the given function plots as a U-shaped
curve, the relative minimum is also the absolute minimum.





Example 2

Find the relative extrema of the function

\[ y = g(x) = x^3 - 3x^2 + 2 \]

The first two derivatives of this function are

\[ g'(x) = 3x^2 - 6x \qquad \text{and} \qquad g''(x) = 6x - 6 \]

Setting g'(x) equal to zero and solving the resulting quadratic equation, 3x² − 6x = 0, we
obtain the critical values x₁* = 2 and x₂* = 0, which in turn yield the two stationary values:

    g(2) = −2    [a minimum because g''(2) = 6 > 0]
    g(0) = 2     [a maximum because g''(0) = −6 < 0]
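
The two steps of the test (verify the zero slope, then read the sign of the second derivative) can be written out as a short routine. This is a minimal sketch for the cubic of Example 2 only; the function names are hypothetical, and the "inconclusive" branch stands for the case f''(x₀) = 0, where the test gives no verdict.

```python
def g(x):
    return x ** 3 - 3 * x ** 2 + 2


def g_prime(x):
    return 3 * x ** 2 - 6 * x


def g_second(x):
    return 6 * x - 6


def classify_stationary_point(x0):
    """Apply the second-derivative test at a critical value x0."""
    if g_prime(x0) != 0:
        raise ValueError("x0 is not a critical value: g'(x0) is not zero")
    curvature = g_second(x0)
    if curvature < 0:
        return "relative maximum"
    if curvature > 0:
        return "relative minimum"
    return "inconclusive"        # g''(x0) = 0: the test gives no verdict


# Critical values from 3x^2 - 6x = 3x(x - 2) = 0
for x0 in (2, 0):
    print(x0, g(x0), classify_stationary_point(x0))
```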





Necessary versus Sufficient Conditions 

As was the case with the first-derivative test, the zero-slope condition f'(x) = 0 plays the
role of a necessary condition in the second-derivative test. Since this condition is based on
the first-order derivative, it is often referred to as the first-order condition. Once we find the
first-order condition satisfied at x = x₀, the negative (positive) sign of f''(x₀) is sufficient
to establish the stationary value in question as a relative maximum (minimum). These
sufficient conditions, which are based on the second-order derivative, are often referred to as
second-order conditions.





¹ To see that an inflection point is possible when f''(x₀) = 0, let us refer back to Fig. 9.3a and 9.3a'.
Point J in the upper diagram is an inflection point, with x = j as its critical value. Since the f'(x)
curve in the lower diagram attains a minimum at x = j, the slope of f'(x) [i.e., f''(x)] must be zero
at the critical value x = j. Thus point J illustrates an inflection point occurring when f''(x₀) = 0.
To see that a relative extremum is also consistent with f''(x₀) = 0, consider the function y = x⁴.
This function plots as a U-shaped curve and has a minimum, y = 0, attained at the critical value
x = 0. Since the second derivative of this function is f''(x) = 12x², we again obtain a zero value for
this derivative at the critical value x = 0. Thus this function illustrates a relative extremum occurring
when f''(x₀) = 0.


TABLE 9.1  Conditions for a Relative Extremum: y = f(x)

Condition                    Maximum        Minimum
First-order necessary        f'(x) = 0      f'(x) = 0
Second-order necessary*      f''(x) ≤ 0     f''(x) ≥ 0
Second-order sufficient*     f''(x) < 0     f''(x) > 0

* Applicable only after the first-order necessary condition has been satisfied.


It bears repeating that the first-order condition is necessary, but not sufficient, for a
relative maximum or minimum. (Remember inflection points?) In sharp contrast, the
second-order condition that f''(x) be negative (positive) at the critical value x₀ is sufficient for a
relative maximum (minimum), but it is not necessary. [Remember the relative extremum
that occurs when f''(x₀) = 0?] For this reason, one should carefully guard against the
following line of argument: "Since the stationary value f(x₀) is already known to be a
minimum, we must have f''(x₀) > 0." The reasoning here is faulty because it incorrectly treats
the positive sign of f''(x₀) as a necessary condition for f(x₀) to be a minimum.

This is not to say that second-order derivatives can never be used in stating necessary
conditions for relative extrema. Indeed they can. But care must then be taken to allow for
the fact that a relative maximum (minimum) can occur not only when f''(x₀) is negative
(positive), but also when f''(x₀) is zero. Consequently, second-order necessary conditions
must be couched in terms of weak inequalities: for a stationary value f(x₀) to be a relative
maximum (minimum), it is necessary that f''(x₀) ≤ 0 (f''(x₀) ≥ 0).

The preceding discussion can be summed up in Table 9.1. All the equations and
inequalities in the table are in the nature of conditions (requirements) to be met, rather than
descriptive specifications of a given function. In particular, the equation f'(x) = 0 does not
signify that function f has a zero slope everywhere; rather, it states the stipulation that only
those values of x that satisfy this requirement can qualify as critical values.





Conditions for Profit Maximization 
We shall now present an economic example of extreme-value problems, i.e., problems of
optimization.

One of the first things that a student of economics learns is that, in order to maximize
profit, a firm must equate marginal cost and marginal revenue. Let us show the mathematical
derivation of this condition. To keep the analysis on a general level, we shall work with
the total-revenue function R = R(Q) and total-cost function C = C(Q), both of which are
functions of a single variable Q. From these it follows that a profit function (the objective
function) may also be formulated in terms of Q (the choice variable):

\[ \pi = \pi(Q) = R(Q) - C(Q) \tag{9.1} \]

To find the profit-maximizing output level, we must satisfy the first-order necessary
condition for a maximum: dπ/dQ = 0. Accordingly, let us differentiate (9.1) with respect
to Q and set the resulting derivative equal to zero. The result is

\[
\begin{aligned}
\frac{d\pi}{dQ} \equiv \pi'(Q) &= R'(Q) - C'(Q) \\
&= 0 \quad \text{iff} \quad R'(Q) = C'(Q)
\end{aligned}
\tag{9.2}
\]







Thus the optimum output (equilibrium output) Q* must satisfy the equation R'(Q*) =
C'(Q*), or MR = MC. This condition constitutes the first-order condition for profit
maximization.

However, the first-order condition may lead to a minimum rather than a maximum; thus
we must check the second-order condition next. We can obtain the second derivative by
differentiating the first derivative in (9.2) with respect to Q:

\[
\begin{aligned}
\frac{d^2\pi}{dQ^2} \equiv \pi''(Q) &= R''(Q) - C''(Q) \\
&\le 0 \quad \text{iff} \quad R''(Q) \le C''(Q)
\end{aligned}
\]

This last inequality is the second-order necessary condition for maximization. If it is not
met, then Q* cannot possibly maximize profit; in fact, it minimizes profit. If R''(Q*) =
C''(Q*), then we are unable to reach a definite conclusion. The best scenario is to find
R''(Q*) < C''(Q*), which satisfies the second-order sufficient condition for a maximum.
In that case, we can conclusively take Q* to be a profit-maximizing output. Economically,
this would mean that, if the rate of change of MR is less than the rate of change of MC at
the output where MC = MR, then that output will maximize profit.

These conditions are illustrated in Fig. 9.7. In Fig. 9.7a we have drawn a total-revenue
and a total-cost curve, which are seen to intersect twice, at output levels of Q₂ and Q₄. In
the open interval (Q₂, Q₄), total revenue R exceeds total cost C, and thus π is positive. But
in the intervals (0, Q₂) and (Q₄, Q₅], where Q₅ represents the upper limit of the firm's
productive capacity, π is negative. This fact is reflected in Fig. 9.7b, where the profit curve
(obtained by plotting the vertical distance between the R and C curves for each level of
output) lies above the horizontal axis only in the interval (Q₂, Q₄).

When we set dπ/dQ = 0, in line with the first-order condition, it is our intention to
locate the peak point K on the profit curve, at output Q₃, where the slope of the curve is
zero. However, the relative-minimum point M (output Q₁) will also offer itself as a
candidate, because it, too, meets the zero-slope requirement. Below, we shall resort to the
second-order condition to eliminate the "wrong" kind of extremum.

The first-order condition dπ/dQ = 0 is equivalent to the condition R'(Q) = C'(Q). In
Fig. 9.7a, the output level Q₃ satisfies this, because the R and C curves do have the same
slope at Q₃ (the tangent lines drawn to the two curves at H and J are parallel to each other).
The same is true for output Q₁. Since the equality of the slopes of R and C means the
equality of MR and MC, outputs Q₃ and Q₁ must obviously be where the MR and MC curves
intersect, as illustrated in Fig. 9.7c.

How does the second-order condition enter into the picture? Let us first look at Fig. 9.7b.
At point K, the second derivative of the π function will (barring the exceptional zero-value
case) have a negative value, π''(Q₃) < 0, because the curve is inverse U-shaped around K;
this means that Q₃ will maximize profit. At point M, on the other hand, we would expect
that π''(Q₁) > 0; thus Q₁ provides a relative minimum for π instead. The second-order
sufficient condition for a maximum can, of course, be stated alternatively as
R''(Q) < C''(Q), that is, that the slope of the MR curve be less than the slope of the MC
curve. From Fig. 9.7c, it is immediately apparent that output Q₃ satisfies this condition,
since the slope of MR is negative while that of MC is positive at point L. But output Q₁
violates this condition because both MC and MR have negative slopes, and that of MR is
numerically smaller than that of MC at point N, which implies that R''(Q₁) is greater than
C''(Q₁) instead. In fact, therefore, output Q₁ also violates the second-order necessary
condition for a relative maximum, but satisfies the second-order sufficient condition for a
relative minimum.

FIGURE 9.7  [figure not reproduced: panel (a) plots the total-revenue curve R and the total-cost
curve C against output Q; panel (b) plots the profit curve π(Q) = R(Q) − C(Q); panel (c) plots
MR ≡ R'(Q) and MC ≡ C'(Q)]




Example 3 


Let the R(Q) and C(Q) functions be

\[
\begin{aligned}
R(Q) &= 1{,}200Q - 2Q^2 \\
C(Q) &= Q^3 - 61.25Q^2 + 1{,}528.5Q + 2{,}000
\end{aligned}
\]

Then the profit function is

\[ \pi(Q) = -Q^3 + 59.25Q^2 - 328.5Q - 2{,}000 \]

where R, C, and π are all in dollar units and Q is in units of (say) tons per week. This profit
function has two critical values, Q = 3 and Q = 36.5, because

\[ \frac{d\pi}{dQ} = -3Q^2 + 118.5Q - 328.5 = 0 \qquad \text{when } Q = 3 \text{ or } Q = 36.5 \]

But since the second derivative is

\[
\frac{d^2\pi}{dQ^2} = -6Q + 118.5
\begin{cases}
> 0 & \text{when } Q = 3 \\
< 0 & \text{when } Q = 36.5
\end{cases}
\]

the profit-maximizing output is Q* = 36.5 (tons per week). (The other output minimizes
profit.) By substituting Q* into the profit function, we can find the maximized profit to be
π* = π(36.5) = 16,318.44 (dollars per week).

As an alternative approach to the preceding, we can first find the MR and MC functions
and then equate the two, i.e., find their intersection. Since

\[
\begin{aligned}
R'(Q) &= 1{,}200 - 4Q \\
C'(Q) &= 3Q^2 - 122.5Q + 1{,}528.5
\end{aligned}
\]

equating the two functions will result in a quadratic equation identical with dπ/dQ = 0,
which has yielded the two critical values of Q cited previously.
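
The arithmetic of Example 3 can be verified in a few lines. The following sketch is only a check on the numbers in the text: it solves the first-order condition with the quadratic formula, applies the second-order condition to each critical value, and evaluates profit there. The function and variable names are hypothetical.

```python
import math


def profit(q):
    return -q ** 3 + 59.25 * q ** 2 - 328.5 * q - 2000


def profit_second(q):
    return -6 * q + 118.5


# First-order condition: -3Q^2 + 118.5Q - 328.5 = 0, solved by the quadratic formula.
a, b, c = -3.0, 118.5, -328.5
root_disc = math.sqrt(b ** 2 - 4 * a * c)
critical_values = sorted([(-b + root_disc) / (2 * a), (-b - root_disc) / (2 * a)])

for q in critical_values:
    kind = "maximum" if profit_second(q) < 0 else "minimum"
    print(q, kind, round(profit(q), 2))
```

Only the larger critical value passes the second-order sufficient condition for a maximum; the smaller one is the profit-minimizing output, at which profit is negative.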





Coefficients of a Cubic Total-Cost Function 

In Example 3, a cubic function is used to represent the total-cost function. The traditional 
total-cost curve C = C(Q), as illustrated in Fig. 9.7a, is supposed to contain two wiggles 
that form a concave segment (decreasing marginal cost) and a subsequent convex segment 
(increasing marginal cost). Since the graph of a cubic function always contains exactly two
wiggles, as illustrated in Fig. 9.4, it should suit that role well. However, Fig. 9.4
immediately alerts us to a problem: the cubic function can possibly produce a downward-sloping
segment in its graph, whereas the total-cost function, to make economic sense, should be
upward-sloping everywhere (a larger output always entails a higher total cost). If we wish
to use a cubic total-cost function such as

\[ C = C(Q) = aQ^3 + bQ^2 + cQ + d \tag{9.3} \]


therefore, it is essential to place appropriate restrictions on the parameters so as to prevent 
the C curve from ever bending downward. 

An equivalent way of stating this requirement is that the MC function should be positive 
throughout, and this can be ensured only if the absolute minimum of the MC function turns 
out to be positive. Differentiating (9.3) with respect to Q, we obtain the MC function 


\[ MC = C'(Q) = 3aQ^2 + 2bQ + c \tag{9.4} \]




which, because it is a quadratic, plots as a parabola as in Fig. 9.7c. In order for the MC
curve to stay positive (above the horizontal axis) everywhere, it is necessary that the
parabola be U-shaped (otherwise, with an inverse U, the curve is bound to extend itself
below the horizontal axis). Hence the coefficient of the Q² term in (9.4) has to be positive; i.e.,
we must impose the restriction a > 0. This restriction, however, is by no means sufficient,
because the minimum value of a U-shaped MC curve, call it MC_min (a relative minimum
which also happens to be an absolute minimum), may still occur below the horizontal
axis. Thus we must next find MC_min and ascertain the parameter restrictions that would
make it positive.

According to our knowledge of relative extremum, the minimum of MC will occur
where

\[ \frac{d}{dQ} MC = 6aQ + 2b = 0 \]

The output level that satisfies this first-order condition is

\[ Q^* = \frac{-2b}{6a} = \frac{-b}{3a} \]

This minimizes (rather than maximizes) MC because the second derivative d²(MC)/dQ² =
6a is assuredly positive in view of the restriction a > 0. The knowledge of Q* now enables
us to calculate MC_min, but we may first infer the sign of coefficient b from it. Inasmuch as
negative output levels are ruled out, we see that b can never be positive (given a > 0).
Moreover, since the law of diminishing returns is assumed to set in at a positive output level
(that is, MC is assumed to have an initial declining segment), Q* should be positive (rather
than zero). Consequently, we must impose the restriction b < 0.
It is a simple matter now to substitute the MC-minimizing output Q* into (9.4) to find
that

\[ MC_{\min} = 3a\left(\frac{-b}{3a}\right)^2 + 2b\left(\frac{-b}{3a}\right) + c = \frac{3ac - b^2}{3a} \]

Thus, to guarantee the positivity of MC_min, we must impose the restriction¹ b² < 3ac. This
last restriction, we may add, in effect also implies the restriction c > 0. (Why?)
The preceding discussion has involved the three parameters a, 6, and c, What about the 
other parameter, d? The answer is that therc is need for a restriction on d also, but that has 
nothing to do with the problem of keeping the MC positive. If we let @ = 0 in (9.3}, we find 


¹ This restriction may also be obtained by the method of completing the square. The MC function can
be successively transformed as follows:

\[
\begin{aligned}
MC &= 3aQ^2 + 2bQ + c \\
   &= \left(3aQ^2 + 2bQ + \frac{b^2}{3a}\right) - \frac{b^2}{3a} + c \\
   &= \left(\sqrt{3a}\,Q + \frac{b}{\sqrt{3a}}\right)^2 + \frac{-b^2 + 3ac}{3a}
\end{aligned}
\]

Since the squared expression can possibly be zero, we must, in order to ensure the positivity of MC,
require that $b^2 < 3ac$ on the knowledge that $a > 0$.



that C(0) = d. The role of d is thus to determine the vertical intercept of the C curve only, 
with no bearing on its slope. Since the economic meaning of d is the fixed cost of a firm, 
the appropriate restriction (in the short-run context) would be d > 0. 

In sum, the coefficients of the total-cost function (9.3) should be restricted as follows 
(assuming the short-run context): 


\[ a, c, d > 0 \qquad b < 0 \qquad b^2 < 3ac \tag{9.5} \]
As you can readily verify, the C(Q) function in Example 3 does satisfy (9.5). 
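The restrictions in (9.5) are easy to check mechanically. The following is a minimal sketch, assuming the cubic total-cost form of (9.3) with coefficients `a`, `b`, `c`, `d`; the function names and the numerical coefficients are illustrative choices, not taken from the text.

```python
def satisfies_cost_restrictions(a, b, c, d):
    """Check (9.5) for C(Q) = a*Q**3 + b*Q**2 + c*Q + d."""
    return a > 0 and c > 0 and d > 0 and b < 0 and b**2 < 3 * a * c


def min_marginal_cost(a, b, c):
    """Return (Q*, MC_min) for MC = 3a*Q**2 + 2b*Q + c, assuming a > 0."""
    q_star = -b / (3 * a)                  # from the first-order condition 6aQ + 2b = 0
    mc_min = (3 * a * c - b**2) / (3 * a)  # MC evaluated at Q*
    return q_star, mc_min


# Hypothetical coefficients, chosen only for illustration.
a, b, c, d = 1.0, -6.0, 15.0, 10.0
print(satisfies_cost_restrictions(a, b, c, d))
print(min_marginal_cost(a, b, c))
```

With these coefficients the minimum of MC occurs at a positive output level and is itself positive, which is exactly what the three restrictions on $a$, $b$, and $c$ are meant to guarantee.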


Upward-Sloping Marginal-Revenue Curve 
The marginal-revenue curve in Fig. 9.7c is shown to be downward-sloping throughout.
This, of course, is how the MR curve is traditionally drawn for a firm under imperfect
competition. However, the possibility of the MR curve being partially, or even wholly,
upward-sloping can by no means be ruled out a priori.²

Given an average-revenue function $AR = f(Q)$, the marginal-revenue function can be
expressed by

\[ MR = f(Q) + Q f'(Q) \qquad \text{[from (7.7)]} \]

The slope of the MR curve can thus be ascertained from the derivative

\[ \frac{d}{dQ} MR = f'(Q) + f'(Q) + Q f''(Q) = 2f'(Q) + Q f''(Q) \]

As long as the AR curve is downward-sloping (as it would be under imperfect competition),
the $2f'(Q)$ term is assuredly negative. But the $Q f''(Q)$ term can be either negative, zero,
or positive, depending on the sign of the second derivative of the AR function, i.e., depending
on whether the AR curve is strictly concave, linear, or strictly convex. If the AR curve
is strictly convex either in its entirety (as illustrated in Fig. 7.2) or along a specific segment,
the possibility will exist that the (positive) $Q f''(Q)$ term may dominate the (negative)
$2f'(Q)$ term, thereby causing the MR curve to be wholly or partially upward-sloping.


Example 4

Let the average-revenue function be

\[ AR = f(Q) = 8{,}000 - 23Q + 1.1Q^2 - 0.018Q^3 \]

As can be verified (see Exercise 9.4-7), this function gives rise to a downward-sloping AR
curve, as is appropriate for a firm under imperfect competition. Since

\[ MR = f(Q) + Q f'(Q) = 8{,}000 - 46Q + 3.3Q^2 - 0.072Q^3 \]

it follows that the slope of MR is

\[ \frac{d}{dQ} MR = -46 + 6.6Q - 0.216Q^2 \]

Because this is a quadratic function and since the coefficient of $Q^2$ is negative, $dMR/dQ$ must
plot as an inverse-U-shaped curve against $Q$, such as shown in Fig. 9.5a. If a segment of this
curve happens to lie above the horizontal axis, the slope of MR will take positive values.


² This point is emphatically brought out in John P. Formby, Stephen Layson, and W. James Smith,
“The Law of Demand, Positive Sloping Marginal Revenue, and Multiple Profit Equilibria,” Economic
Inquiry, April 1982, pp. 303-311.




Setting $dMR/dQ = 0$ and applying the quadratic formula, we find the two zeros of the
quadratic function to be $Q_1 = 10.76$ and $Q_2 = 19.80$ (approximately). This means that, for
values of $Q$ in the open interval $(Q_1, Q_2)$, the $dMR/dQ$ curve does lie above the horizontal
axis. Thus the marginal-revenue curve indeed is positively sloped for output levels between
$Q_1$ and $Q_2$.
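The two zeros can be recomputed directly from the quadratic formula. The sketch below assumes the slope function $dMR/dQ = -46 + 6.6Q - 0.216Q^2$ given in the example; the helper names `mr_slope` and `quadratic_roots` are hypothetical.

```python
import math


def mr_slope(q):
    """Slope of MR in Example 4: dMR/dQ = -46 + 6.6Q - 0.216Q^2."""
    return -46 + 6.6 * q - 0.216 * q**2


def quadratic_roots(a, b, c):
    """Real roots of a*q**2 + b*q + c = 0 in ascending order (empty if none)."""
    disc = b**2 - 4 * a * c
    if disc < 0:
        return ()
    root = math.sqrt(disc)
    return tuple(sorted(((-b - root) / (2 * a), (-b + root) / (2 * a))))


q1, q2 = quadratic_roots(-0.216, 6.6, -46)
print(round(q1, 2), round(q2, 2))

# The slope is positive strictly between the two zeros and negative outside them.
print(mr_slope(5), mr_slope(15), mr_slope(25))
```

Evaluating the slope at one output level inside the interval and one on either side is a quick way to confirm which segment of the MR curve is upward-sloping.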

The presence of a positively sloped segment on the MR curve has interesting implications.
Such an MR curve may produce more than one intersection with the MC curve
satisfying the second-order sufficient condition for profit maximization. While all such
intersections constitute local optima, however, only one of them is the global optimum that
the firm is seeking.





EXERCISE 9.4 
1. Find the relative maxima and minima of y by the second-derivative test:
(a) $y = -2x^2 + 8x + 25$        (c) $y = \frac{1}{3}x^3 - 3x^2 + 5x + 3$
(b) $y = x^3 + 6x^2 + 9$         (d) $y = \dfrac{2x}{1 - 2x}$  $\left(x \neq \frac{1}{2}\right)$


2. Mr. Greenthumb wishes to mark out a rectangular flower bed, using a wall of his house
as one side of the rectangle. The other three sides are to be marked by wire netting, of
which he has only 64 ft available. What are the length L and width W of the rectangle
that would give him the largest possible planting area? How do you make sure that
your answer gives the largest, not the smallest area?


3. A firm has the following total-cost and demand functions:
\[ C = \tfrac{1}{3}Q^3 - 7Q^2 + 111Q + 50 \]
\[ Q = 100 - P \]
(a) Does the total-cost function satisfy the coefficient restrictions of (9.5)?
(b) Write out the total-revenue function R in terms of Q.
(c) Formulate the total-profit function $\pi$ in terms of Q.
(d) Find the profit-maximizing level of output Q*.
(e) What is the maximum profit?


4. If coefficient $b$ in (9.3) were to take a zero value, what would happen to the marginal-cost
and total-cost curves?

5. A quadratic profit function $\pi(Q) = hQ^2 + jQ + k$ is to be used to reflect the following
assumptions:
(a) If nothing is produced, the profit will be negative (because of fixed costs).
(b) The profit function is strictly concave.
(c) The maximum profit occurs at a positive output level Q*.
What parameter restrictions are called for?

6. A purely competitive firm has a single variable input $L$ (labor), with the wage rate $W_0$
per period. Its fixed inputs cost the firm a total of $F$ dollars per period. The price of the
product is $P_0$.
(a) Write the production function, revenue function, cost function, and profit function
of the firm.
(b) What is the first-order condition for profit maximization? Give this condition an
economic interpretation.
(c) What economic circumstances would ensure that profit is maximized rather than
minimized?

7. Use the following procedure to verify that the AR curve in Example 4 is negatively
sloped:
(a) Denote the slope of AR by $S$. Write an expression for $S$.
(b) Find the maximum value of $S$, $S_{\max}$, by using the second-derivative test.
(c) Then deduce from the value of $S_{\max}$ that the AR curve is negatively sloped
throughout.


9.5 Maclaurin and Taylor Series 





The time has now come for us to develop a test for relative extrema that can apply even
when the second derivative turns out to have a zero value at the stationary point. Before we
can do that, however, it is first necessary to discuss the so-called expansion of a function
$y = f(x)$ into what are known, respectively, as a Maclaurin series (expansion around the
point $x = 0$) and a Taylor series (expansion around any point $x = x_0$).

To expand a function $y = f(x)$ around a point $x_0$ means, in the present context, to transform
that function into a polynomial form, in which the coefficients of the various terms are
expressed in terms of the derivative values $f'(x_0)$, $f''(x_0)$, etc., all evaluated at the point
of expansion $x_0$. In the Maclaurin series, these will be evaluated at $x = 0$; thus we have
$f'(0)$, $f''(0)$, etc., in the coefficients. The result of expansion is a power series because,
being a polynomial, it consists of a sum of power functions.


Maclaurin Series of a Polynomial Function 
Let us consider first the expansion of a polynomial function of the nth degree,

\[ f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + \cdots + a_n x^n \tag{9.6} \]

into an equivalent nth-degree polynomial where the coefficients ($a_0$, $a_1$, etc.) are expressed
instead in terms of the derivative values $f'(0)$, $f''(0)$, etc. Since this involves the transformation
of one polynomial into another of the same degree, it may seem a sterile and purposeless
exercise, but actually it will serve to shed much light on the whole idea of expansion.

Since the power series after expansion will involve the derivatives of various orders of 
the function f, let us first find these. By successive differentiation of (9.6), we can get the 
derivatives as follows: 


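\[
\begin{aligned}
f'(x) &= a_1 + 2a_2 x + 3a_3 x^2 + 4a_4 x^3 + \cdots + n a_n x^{n-1} \\
f''(x) &= 2a_2 + 3(2)a_3 x + 4(3)a_4 x^2 + \cdots + n(n-1)a_n x^{n-2} \\
f'''(x) &= 3(2)a_3 + 4(3)(2)a_4 x + \cdots + n(n-1)(n-2)a_n x^{n-3} \\
f^{(4)}(x) &= 4(3)(2)a_4 + 5(4)(3)(2)a_5 x + \cdots + n(n-1)(n-2)(n-3)a_n x^{n-4} \\
&\;\;\vdots \\
f^{(n)}(x) &= n(n-1)(n-2)(n-3)\cdots(3)(2)(1)a_n
\end{aligned}
\]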




Note that each successive differentiation reduces the number of terms by one (the additive
constant in front drops out) until, in the nth derivative, we are left with a single product
term (a constant term). These derivatives can be evaluated at various values of $x$; here we
shall evaluate them at $x = 0$, with the result that all terms involving $x$ will drop out. We are
then left with the following exceptionally neat derivative values:

\[ f'(0) = a_1 \qquad f''(0) = 2a_2 \qquad f'''(0) = 3(2)a_3 \qquad f^{(4)}(0) = 4(3)(2)a_4 \qquad \cdots \]
\[ f^{(n)}(0) = n(n-1)(n-2)(n-3)\cdots(3)(2)(1)a_n \tag{9.7} \]

If we now adopt a shorthand symbol $n!$ (read: “n factorial”), defined as

\[ n! \equiv n(n-1)(n-2)\cdots(3)(2)(1) \qquad (n = \text{a positive integer}) \]

so that, for example, $2! = 2 \times 1 = 2$ and $3! = 3 \times 2 \times 1 = 6$, etc. (with $0!$ defined as
equal to 1), then the result in (9.7) can be rewritten as

\[ a_1 = \frac{f'(0)}{1!} \qquad a_2 = \frac{f''(0)}{2!} \qquad a_3 = \frac{f'''(0)}{3!} \qquad \cdots \qquad a_n = \frac{f^{(n)}(0)}{n!} \]
Substituting these into (9.6) and utilizing the obvious fact that $f(0) = a_0$, we can now
express the given function $f(x)$ as a new, but equivalent, same-degree polynomial in which
the coefficients are expressed in terms of derivatives evaluated at $x = 0$:³

\[ f(x) = \frac{f(0)}{0!} + \frac{f'(0)}{1!}x + \frac{f''(0)}{2!}x^2 + \frac{f'''(0)}{3!}x^3 + \cdots + \frac{f^{(n)}(0)}{n!}x^n \qquad \text{[Maclaurin's formula]} \tag{9.8} \]


This new polynomial, called the Maclaurin series of the polynomial function $f(x)$, represents
the expansion of the function $f(x)$ around zero ($x = 0$). Note that the point of
expansion (here, 0) is simply the value of $x$ that will be used to evaluate $f(x)$ and all its
derivatives.


Example 1

Find the Maclaurin series for the function

\[ f(x) = 2 + 4x + 3x^2 \tag{9.9} \]

This function has the derivatives

\[ f'(x) = 4 + 6x \qquad f''(x) = 6 \qquad \text{so that} \qquad f'(0) = 4 \qquad f''(0) = 6 \]

Thus the Maclaurin series is

\[ f(x) = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 = 2 + 4x + 3x^2 \]

The previous line verifies that the Maclaurin series does indeed correctly represent the given
function.
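The same check can be carried out for any polynomial by differentiating repeatedly and reading off the constant term, which is the derivative's value at zero. This is a minimal sketch, assuming a polynomial is stored as a coefficient list `[a0, a1, ..., an]`; the function names are illustrative.

```python
from math import factorial


def derivative(coeffs):
    """Differentiate a polynomial given as [a0, a1, ..., an]."""
    return [k * a for k, a in enumerate(coeffs)][1:]


def maclaurin_coefficients(coeffs):
    """Return [f(0)/0!, f'(0)/1!, ..., f^(n)(0)/n!] for the polynomial."""
    out, current = [], list(coeffs)
    for k in range(len(coeffs)):
        # The constant term of the k-th derivative is its value at x = 0.
        out.append(current[0] / factorial(k))
        current = derivative(current)
    return out


# f(x) = 2 + 4x + 3x^2, the function in (9.9)
print(derivative([2, 4, 3]))
print(maclaurin_coefficients([2, 4, 3]))
```

The coefficients produced this way coincide with the original ones, which is the content of Maclaurin's formula for a polynomial.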


³ Since $0! = 1$ and $1! = 1$, the first two terms on the right of the equals sign in (9.8) can be written
more simply as $f(0)$ and $f'(0)x$, respectively. We have included the denominators $0!$ and $1!$ here to
call attention to the symmetry among the various terms in the expansion.




Taylor Series of a Polynomial Function 

More generally, the polynomial function in (9.6) can be expanded around any point $x_0$, not
necessarily zero. In the interest of simplicity, we shall explain this by means of the specific
quadratic function in (9.9) and generalize the result later.

For the purpose of expansion around a specific point $x_0$, we may first interpret any given
value of $x$ as a deviation from $x_0$. More specifically, we shall let $x = x_0 + \delta$, where $\delta$
represents the deviation from the value $x_0$. Upon such interpretation, the given function
(9.9) and its derivatives now become

\[
\begin{aligned}
f(x) &= 2 + 4(x_0 + \delta) + 3(x_0 + \delta)^2 \\
f'(x) &= 4 + 6(x_0 + \delta) \\
f''(x) &= 6
\end{aligned}
\tag{9.10}
\]

We know that the expression $(x_0 + \delta) = x$ is a variable in the function, but since $x_0$ in the
present context is a fixed (chosen) number, only $\delta$ can be properly regarded as a variable in
(9.10). Consequently, $f(x)$ is in fact a function of $\delta$, say, $g(\delta)$:

\[ g(\delta) = 2 + 4(x_0 + \delta) + 3(x_0 + \delta)^2 \qquad [\equiv f(x)] \]

with derivatives

\[
\begin{aligned}
g'(\delta) &= 4 + 6(x_0 + \delta) & [\equiv f'(x)] \\
g''(\delta) &= 6 & [\equiv f''(x)]
\end{aligned}
\]
We already know how to expand $g(\delta)$ around zero ($\delta = 0$). According to (9.8), such an
expansion will yield the following Maclaurin series:

\[ g(\delta) = \frac{g(0)}{0!} + \frac{g'(0)}{1!}\delta + \frac{g''(0)}{2!}\delta^2 \tag{9.11} \]

But since we have let $x = x_0 + \delta$, the fact that $\delta = 0$ implies $x = x_0$; hence, on the basis of
the identity $g(\delta) \equiv f(x)$, we can write for the case of $\delta = 0$:

\[ g(0) = f(x_0) \qquad g'(0) = f'(x_0) \qquad g''(0) = f''(x_0) \]

Upon substituting these into (9.11), we find the result to represent the expansion of $f(x)$
around the point $x_0$, because the coefficients now involve the derivatives $f'(x_0)$, $f''(x_0)$,
etc., all evaluated at $x = x_0$:

\[ f(x)\;[= g(\delta)] = \frac{f(x_0)}{0!} + \frac{f'(x_0)}{1!}(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 \tag{9.12} \]





You should compare this result, the Taylor polynomial of $f(x)$, with the Maclaurin
polynomial of $g(\delta)$ in (9.11).

Since for the specific function under consideration, (9.9), we have

\[ f(x_0) = 2 + 4x_0 + 3x_0^2 \qquad f'(x_0) = 4 + 6x_0 \qquad f''(x_0) = 6 \]

the Taylor polynomial in (9.12) becomes

\[
\begin{aligned}
f(x) &= 2 + 4x_0 + 3x_0^2 + (4 + 6x_0)(x - x_0) + \tfrac{6}{2}(x - x_0)^2 \\
     &= 2 + 4x + 3x^2
\end{aligned}
\]

This verifies that the Taylor polynomial does correctly represent the given function.




The expansion formula in (9.12) can be generalized to apply to the nth-degree polynomial
of (9.6). The generalized formula is

\[ f(x) = \frac{f(x_0)}{0!} + \frac{f'(x_0)}{1!}(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n \qquad \text{[Taylor's formula]} \tag{9.13} \]

This differs from Maclaurin's formula in (9.8) only in the replacement of zero by $x_0$ as the
point of expansion, and in the replacement of $x$ by the expression $(x - x_0)$. What (9.13)
tells us is that, given an nth-degree polynomial $f(x)$, if we let $x = 7$ (say) in the terms on
the right of (9.13), select an arbitrary number $x_0$, then evaluate and add these terms, we will
end up exactly with $f(7)$, the value of $f(x)$ at $x = 7$.
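The claim that the right side of Taylor's formula reproduces $f(7)$ for any choice of expansion point can be tried numerically. The sketch below assumes a polynomial stored as a coefficient list and uses the quadratic of (9.9) as the test case; the helper names and the particular expansion points are illustrative.

```python
from math import factorial


def derivative(coeffs):
    """Differentiate a polynomial given as [a0, a1, ..., an]."""
    return [k * a for k, a in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial [a0, a1, ..., an] at x."""
    return sum(a * x**k for k, a in enumerate(coeffs))


def taylor_value(coeffs, x0, x):
    """Sum the terms f^(k)(x0)/k! * (x - x0)**k for k = 0, ..., n."""
    total, current = 0.0, list(coeffs)
    for k in range(len(coeffs)):
        total += evaluate(current, x0) / factorial(k) * (x - x0)**k
        current = derivative(current)
    return total


f = [2, 4, 3]                      # f(x) = 2 + 4x + 3x^2
for x0 in (0, 3, -2.5):            # arbitrary expansion points
    print(x0, taylor_value(f, x0, 7), evaluate(f, 7))
```

Whatever expansion point is chosen, the two printed values should agree (up to floating-point rounding), since for a polynomial of degree n the nth-degree Taylor expansion is exact.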


Example 2

Taking $x_0 = 3$ as the point of expansion, we can rewrite (9.6) equivalently as

\[ f(x) = f(3) + f'(3)(x - 3) + \frac{f''(3)}{2!}(x - 3)^2 + \cdots + \frac{f^{(n)}(3)}{n!}(x - 3)^n \]


Expansion of an Arbitrary Function 

Heretofore, we have shown how an nth-degree polynomial function can be expressed in
another, equivalent, nth-degree polynomial form. As it turns out, it is also possible to
express any arbitrary function $\phi(x)$ (one that is not necessarily a polynomial) in a polynomial
form similar to (9.13), provided $\phi(x)$ has finite, continuous derivatives up to the
desired order at the expansion point $x_0$.

According to a mathematical proposition known as Taylor's theorem, given an arbitrary
function $\phi(x)$, if we know the value of the function at $x = x_0$ [that is, $\phi(x_0)$] and the values
of its derivatives at $x_0$ [that is, $\phi'(x_0)$, $\phi''(x_0)$, etc.], then this function can be expanded
around the point $x_0$ as follows ($n$ = a fixed positive integer arbitrarily chosen):


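\[
\begin{aligned}
\phi(x) &= \left[\frac{\phi(x_0)}{0!} + \frac{\phi'(x_0)}{1!}(x - x_0) + \frac{\phi''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{\phi^{(n)}(x_0)}{n!}(x - x_0)^n\right] + R_n \\
        &\equiv P_n + R_n \qquad \text{[Taylor's formula with remainder]}
\end{aligned}
\tag{9.14}
\]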


where $P_n$ represents the (bracketed) nth-degree polynomial [the first $(n + 1)$ terms on the
right], and $R_n$ denotes a remainder, to be explained later in this section.† The presence of $R_n$ is
what distinguishes (9.14) from Taylor's formula (9.13), and for this reason (9.14) is called
Taylor's formula with remainder. The form of the polynomial $P_n$ and the size of the
remainder $R_n$ will depend on the value of $n$ we choose. The larger the $n$, the more terms
there will be in $P_n$; accordingly, $R_n$ will in general assume a different value for each different
$n$. This fact explains the need for the subscript $n$ in these two symbols. As a memory
aid, we can identify $n$ as the order of the highest derivative in $P_n$. (In the special case of
$n = 0$, no derivative will appear in $P_n$ at all.)


† The symbol $R_n$ (remainder) is not to be confused with the symbol $R^n$ (n-space).




The appearance of $R_n$ in (9.14) is due to the fact that we are here dealing with an arbitrary
function $\phi$ which cannot always be transformed exactly into, but can only be approximated
by, the polynomial form shown in (9.13). Therefore, a remainder term is included as
a supplement to the $P_n$ part, to represent the discrepancy between $\phi(x)$ and $P_n$. Thus, $P_n$
constitutes a polynomial approximation to $\phi(x)$, with the term $R_n$ as a measure of the error
of approximation. If we choose $n = 1$, for example, we have

\[ \phi(x) = [\phi(x_0) + \phi'(x_0)(x - x_0)] + R_1 = P_1 + R_1 \]

where $P_1$ consists of $n + 1 = 2$ terms and constitutes a linear approximation to $\phi(x)$. If we
choose $n = 2$, a second-power term will appear, so that

\[ \phi(x) = \left[\phi(x_0) + \phi'(x_0)(x - x_0) + \frac{\phi''(x_0)}{2!}(x - x_0)^2\right] + R_2 = P_2 + R_2 \]

where $P_2$, consisting of $n + 1 = 3$ terms, is a quadratic approximation to $\phi(x)$. And so
forth. The fact that we can create polynomial approximations to any arbitrary function (provided
it has finite, continuous derivatives) is of great practical significance. Polynomial
functions, even higher-degree ones, are relatively easy to work with, and if they can
serve as good approximations to some difficult functions, much convenience is to be
gained, as the next two examples will illustrate.

We should point out that the arbitrary function $\phi(x)$ could obviously encompass the nth-degree
polynomial of (9.6) as a special case. For this latter case, if the expansion is into
another nth-degree polynomial, the result of (9.13) will exactly apply; or in other words, we
can use the result in (9.14), with $R_n = 0$. However, if the given nth-degree polynomial $f(x)$
is to be expanded into a polynomial of a lesser degree, then the latter can only be considered
an approximation to $f(x)$, and a remainder must appear; in that case, the result in
(9.14) can be applied with a nonzero remainder. Thus Taylor's formula in the form of (9.14)
is perfectly general.







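Example 3

Expand the nonpolynomial function

\[ \phi(x) = \frac{1}{1 + x} \]

around the point $x_0 = 1$, with $n = 4$. We shall need the first four derivatives of $\phi(x)$, which
are

\[
\begin{aligned}
\phi'(x) &= -(1 + x)^{-2} & \text{so that} \quad \phi'(1) &= -(2)^{-2} = -\tfrac{1}{4} \\
\phi''(x) &= 2(1 + x)^{-3} & \phi''(1) &= 2(2)^{-3} = \tfrac{1}{4} \\
\phi'''(x) &= -6(1 + x)^{-4} & \phi'''(1) &= -6(2)^{-4} = -\tfrac{3}{8} \\
\phi^{(4)}(x) &= 24(1 + x)^{-5} & \phi^{(4)}(1) &= 24(2)^{-5} = \tfrac{3}{4}
\end{aligned}
\]

Also, we see that $\phi(1) = \frac{1}{2}$. Thus, setting $x_0 = 1$ in (9.14) and utilizing the obtained derivatives,
we arrive at the following Taylor series with remainder:

\[
\begin{aligned}
\phi(x) &= \frac{1}{2} - \frac{1}{4}(x - 1) + \frac{1}{8}(x - 1)^2 - \frac{1}{16}(x - 1)^3 + \frac{1}{32}(x - 1)^4 + R_4 \\
        &= \frac{31}{32} - \frac{13}{16}x + \frac{1}{2}x^2 - \frac{3}{16}x^3 + \frac{1}{32}x^4 + R_4
\end{aligned}
\]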


FIGURE 9.8


It is possible, of course, to choose $x_0 = 0$ as the point of expansion here, too. In that
case, with $x_0$ set equal to zero in (9.14), the expansion will result in a Maclaurin series with
remainder.
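The arithmetic in Example 3 can be checked with exact fractions. The sketch below relies on the pattern visible in the four derivatives above, namely that the kth derivative of $1/(1 + x)$ is $(-1)^k\,k!\,(1 + x)^{-(k+1)}$, so that each Taylor coefficient is $(-1)^k/(1 + x_0)^{k+1}$; the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def taylor_coeffs_reciprocal(x0, n):
    """Coefficients of (x - x0)**k, k = 0..n, for phi(x) = 1/(1 + x).

    Uses phi^(k)(x0)/k! = (-1)**k / (1 + x0)**(k + 1).
    """
    return [Fraction((-1)**k) / Fraction(1 + x0)**(k + 1) for k in range(n + 1)]


def expand_in_powers_of_x(shifted, x0):
    """Rewrite sum c_k (x - x0)**k as sum b_j x**j via the binomial theorem."""
    out = [Fraction(0)] * len(shifted)
    for k, c in enumerate(shifted):
        for j in range(k + 1):
            out[j] += c * comb(k, j) * Fraction(-x0)**(k - j)
    return out


shifted = taylor_coeffs_reciprocal(1, 4)
print(shifted)                             # coefficients of (x - 1)^k
print(expand_in_powers_of_x(shifted, 1))   # coefficients of x^j
```

The first list corresponds to the expansion in powers of $(x - 1)$ and the second to the same polynomial collected in powers of $x$; neither includes the remainder $R_4$.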


Example 4

Expand the quadratic function

\[ \phi(x) = 5 + 2x + x^2 \]

around $x_0 = 1$, with $n = 1$. This function is, like (9.9) in Example 1, a second-degree polynomial.
But since $n = 1$, our assigned task is to expand it into a first-degree polynomial,
i.e., to find a linear approximation to the given quadratic function; thus a remainder term
is bound to appear. For this reason, $\phi(x)$ should be viewed as an “arbitrary” function for the
purpose of this Taylor expansion.

To carry out this expansion, we need only the first derivative $\phi'(x) = 2 + 2x$. Evaluated
at $x_0 = 1$, the given function and its derivative yield

\[ \phi(x_0) = \phi(1) = 8 \qquad \phi'(x_0) = \phi'(1) = 4 \]

Thus Taylor's formula with remainder gives us

\[
\begin{aligned}
\phi(x) &= \phi(x_0) + \phi'(x_0)(x - x_0) + R_1 \\
        &= 8 + 4(x - 1) + R_1 = 4 + 4x + R_1
\end{aligned}
\]

where the $(4 + 4x)$ term is a linear approximation and the $R_1$ term represents the error of
approximation.

In Fig. 9.8, $\phi(x)$ plots as a parabola, and its linear approximation as a straight line tangent
to the $\phi(x)$ curve at the point (1, 8). The occurrence of the point of tangency at $x = 1$
is not a matter of coincidence; rather, it is the direct consequence of the fact that the point
of expansion is set at that particular value of $x$. This suggests that, when an arbitrary function
$\phi(x)$ is approximated by a polynomial, the latter will give the exact value of $\phi(x)$ at
(and only at) the point of expansion, with zero error of approximation ($R_1 = 0$). Elsewhere,
$R_1$ is strictly nonzero and, in fact, shows increasingly larger errors of approximation as we




try to approximate $\phi(x)$ for $x$ values farther and farther away from the point of expansion
$x_0$. Thus, when attempting to approximate any function $\phi(x)$ by a polynomial, if we are
most interested in obtaining an accurate approximation in the neighborhood of a specific
value of $x$, say $x_0$, then we ought to choose $x_0$ as the point of expansion.
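How the error grows away from the point of expansion can be tabulated directly. This sketch uses the quadratic $\phi(x) = 5 + 2x + x^2$ and its linear approximation $4 + 4x$ from Example 4; the sample $x$ values are arbitrary choices made for illustration.

```python
def phi(x):
    """The quadratic function of Example 4."""
    return 5 + 2 * x + x**2


def linear_approx(x):
    """P_1: the linear Taylor approximation around x0 = 1."""
    return 4 + 4 * x


def remainder(x):
    """R_1 = phi(x) - P_1(x), the error of approximation."""
    return phi(x) - linear_approx(x)


for x in (1.0, 1.1, 1.5, 2.0, 3.0):
    print(x, phi(x), linear_approx(x), remainder(x))
```

For this particular function the remainder works out to $(x - 1)^2$, so it vanishes only at the point of expansion and increases with the distance from it on either side.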

The construction of Fig. 9.8 is strongly reminiscent of Fig. 8.1. Indeed, both figures are
concerned with “approximations.” But there is a difference in the scope of approximation.
In Fig. 8.1, we attempt to approximate $\Delta y$ by the differential $dy$ with the help of a tangent
line drawn at $x_0$, a given starting value of $x$. In Fig. 9.8, on the other hand, we aim more
broadly to approximate an entire curve by a particular straight line, i.e., to approximate the
height of the curve at any value of $x$, say, $x_1$, by the corresponding height of the straight line
at $x_1$. Note that, in both cases, the error of approximation varies with the value of $x$. In
Fig. 8.1, the error (the difference between $dy$ and $\Delta y$) gets smaller as $\Delta x$ gets smaller, or as
$x$ gets closer to $x_0$, at which the tangent line is drawn. In Fig. 9.8, the error (the vertical
discrepancy between the straight line and the curve) gets smaller as $x$ approaches $x_0$, the
chosen point of expansion.


Lagrange Form of the Remainder 
Now we must comment further on the remainder term. According to the Lagrange form of
the remainder, we can express $R_n$ as

\[ R_n = \frac{\phi^{(n+1)}(p)}{(n + 1)!}(x - x_0)^{n+1} \tag{9.15} \]


where $p$ is some number between $x$ (the point where we wish to evaluate the arbitrary function
$\phi$) and $x_0$ (the point where we expand the function $\phi$). Note that this expression closely
resembles the term which should logically follow the last term in $P_n$ in (9.14), except that
the derivative involved is here to be evaluated at a point $p$ instead of $x_0$. Since the point $p$
is, unfortunately, not otherwise specified, this formula does not really enable us to calculate
$R_n$; nevertheless, it does have great analytical significance. Let us therefore illustrate its
meaning graphically, although we shall do it only for the simple case of $n = 0$.

When $n = 0$, no derivatives whatever will appear in the polynomial part $P_0$; therefore
(9.14) reduces to

\[ \phi(x) = P_0 + R_0 = \phi(x_0) + \phi'(p)(x - x_0) \]
\[ \text{or} \qquad \phi(x) - \phi(x_0) = \phi'(p)(x - x_0) \]


This result, a simple version of the mean-value theorem, states that the difference between
the value of the function $\phi$ at $x_0$ and at any other $x$ value can be expressed as the product
of the difference $(x - x_0)$ and the derivative $\phi'$ evaluated at $p$ (with $p$ being some point
between $x$ and $x_0$). Let us look at Fig. 9.9, where the function $\phi(x)$ is shown as a continuous
curve with derivative values defined at all points. Let $x_0$ be the chosen point of expansion,
and let $x$ be any point on the horizontal axis. If we try to approximate $\phi(x)$, or distance
$xB$, by $\phi(x_0)$, or distance $x_0A$, it will involve an error equal to $\phi(x) - \phi(x_0)$, or the
distance $CB$. What the mean-value theorem says is that the error $CB$ (which constitutes
the value of the remainder term $R_0$ in the expansion) can be expressed as $\phi'(p)(x - x_0)$,
where $p$ is some point between $x$ and $x_0$. First we locate, on the curve between points




$A$ and $B$, a point $D$ such that the tangent line at $D$ is parallel to line $AB$; such a point $D$ must
exist, since the curve passes from $A$ to $B$ in a continuous and smooth manner. Then, the
remainder will be

\[
\begin{aligned}
R_0 = CB &= \frac{CB}{AC}\,AC = (\text{slope of } AB) \cdot AC \\
         &= (\text{slope of tangent at } D) \cdot AC \\
         &= (\text{slope of curve at } x = p) \cdot AC \\
         &= \phi'(p)(x - x_0)
\end{aligned}
\]

where the point $p$ is between $x$ and $x_0$, as required. This demonstrates the rationale of the
Lagrange form of the remainder for the case $n = 0$. We can always express $R_0$ as
$\phi'(p)(x - x_0)$ because, even though $p$ cannot be assigned a specific value, we can be sure
that such a point exists.
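Although the theorem only asserts that $p$ exists, for a concrete function it can be located numerically. The following sketch borrows $\phi(x) = 5 + 2x + x^2$ from Example 4 and picks the interval from $x_0 = 1$ to $x = 3$ as an assumption for illustration; it finds $p$ by bisection, which presumes that $\phi'$ minus the slope of the chord changes sign on the interval.

```python
def phi(x):
    return 5 + 2 * x + x**2


def phi_prime(x):
    return 2 + 2 * x


def mean_value_point(f, f_prime, x0, x, tol=1e-12):
    """Find p between x0 and x (x0 < x) with f'(p) = (f(x) - f(x0)) / (x - x0)."""
    slope = (f(x) - f(x0)) / (x - x0)      # slope of the chord AB

    def gap(p):
        return f_prime(p) - slope

    lo, hi = x0, x
    if gap(lo) * gap(hi) > 0:
        raise ValueError("bisection needs a sign change on the interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(lo) * gap(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


p = mean_value_point(phi, phi_prime, 1.0, 3.0)
print(p)
print(phi(3.0) - phi(1.0), phi_prime(p) * (3.0 - 1.0))   # R_0 computed two ways
```

The two quantities on the last line are the remainder $R_0$ measured directly and as $\phi'(p)(x - x_0)$; they should agree to within the tolerance of the search.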

Equation (9.15) provides a way of expressing the remainder term $R_n$, but it does not
eliminate $R_n$ as a source of discrepancy between $\phi(x)$ and the polynomial $P_n$. However, if
it happens that as we increase $n$ (thus raising the degree of the polynomial) indefinitely, we
find that

\[ R_n \to 0 \text{ as } n \to \infty \qquad \text{so that} \qquad P_n \to \phi(x) \text{ as } n \to \infty \]

then the Taylor series is said to be convergent to $\phi(x)$ at the point of expansion, and the
Taylor series can be written as a convergent infinite series as follows:

\[ \phi(x) = \frac{\phi(x_0)}{0!} + \frac{\phi'(x_0)}{1!}(x - x_0) + \frac{\phi''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{\phi^{(n)}(x_0)}{n!}(x - x_0)^n + \cdots \tag{9.16} \]





Note that the $R_n$ term is no longer shown; in its place is an ellipsis signifying that the polynomial
contains an infinite number of subsequent terms whose mathematical structures
follow the pattern indicated by the previous terms. In this (convenient) event, it will be possible
to make $P_n$ as accurate an approximation to $\phi(x)$ as we desire by choosing a large
enough value for $n$, that is, by including a large enough number of terms in the polynomial
$P_n$. An important example of this will be discussed in Sec. 10.2.
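A shrinking remainder can be observed numerically for the function of Example 3. The sketch below assumes $\phi(x) = 1/(1 + x)$ expanded around $x_0 = 1$, with coefficients $(-1)^k/(1 + x_0)^{k+1}$, and evaluates the partial sums at $x = 1.5$, an evaluation point chosen for illustration. Because this expansion is a geometric series, it converges only when $|x - x_0| < 1 + x_0$.

```python
def phi(x):
    return 1 / (1 + x)


def taylor_partial_sum(x, n, x0=1.0):
    """P_n for phi(x) = 1/(1 + x) around x0."""
    return sum((-1)**k / (1 + x0)**(k + 1) * (x - x0)**k for k in range(n + 1))


x = 1.5
for n in (0, 1, 2, 4, 8, 16):
    p_n = taylor_partial_sum(x, n)
    print(n, p_n, phi(x) - p_n)        # n, P_n, and the remainder R_n
```

Each additional term reduces the absolute size of the remainder at this point, so the polynomial can be made as accurate as desired by raising $n$.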










EXERCISE 9.5 
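1. Find the value of the following factorial expressions:
(a) $5!$    (b) $8!$    (c) $\dfrac{4!}{3!}$    (d) $\dfrac{6!}{4!}$    (e) $\dfrac{(n + 2)!}{n!}$
2. Find the first five terms of the Maclaurin series (i.e., choose $n = 4$ and let $x_0 = 0$) for:
(a) $\phi(x) = \dfrac{1}{1 - x}$    (b) $\phi(x) = \dfrac{1 - x}{1 + x}$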


3. Find the Taylor series with $n = 4$ and $x_0 = -2$, for the two functions in Prob. 2.

4. On the basis of Taylor's formula with the Lagrange form of the remainder [see (9.14)
and (9.15)], show that at the point of expansion ($x = x_0$) the Taylor series will always
give exactly the value of the function at that point, $\phi(x_0)$, not merely an approximation.


9.6 Nth-Derivative Test for Relative 
Extremum of a Function of One Variable 





The expansion of a function into a Taylor (or Maclaurin) series is useful as an approximation
device in the circumstance that $R_n \to 0$ as $n \to \infty$, but our present concern is with its
application in the development of a general test for a relative extremum.


Taylor Expansion and Relative Extremum 
As a preparatory step for that task, let us redefine a relative extremum as follows:

A function $f(x)$ attains a relative maximum (minimum) value at $x_0$ if $f(x) - f(x_0)$ is
negative (positive) for values of $x$ in the immediate neighborhood of $x_0$, both to its left and
to its right.

This can be made clear by reference to Fig. 9.10, where $x_1$ is a value of $x$ to the left of $x_0$,
and $x_2$ is a value of $x$ to the right of $x_0$. In Fig. 9.10a, $f(x_0)$ is a relative maximum; thus
$f(x_0)$ exceeds both $f(x_1)$ and $f(x_2)$. In short, $f(x) - f(x_0)$ is negative for any value of $x$
in the immediate neighborhood of $x_0$. The opposite is true of Fig. 9.10b, where $f(x_0)$ is a
relative minimum, and thus $f(x) - f(x_0) > 0$.

Assuming $f(x)$ to have finite, continuous derivatives up to the desired order at the point
$x = x_0$, the function $f(x)$ (not necessarily polynomial) can be expanded around the
point $x_0$ as a Taylor series. On the basis of (9.14) (after duly changing $\phi$ to $f$), and using the
Lagrange form of the remainder, we can write

\[
\begin{aligned}
f(x) - f(x_0) = {} & f'(x_0)(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots \\
 & + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n + \frac{f^{(n+1)}(p)}{(n + 1)!}(x - x_0)^{n+1}
\end{aligned}
\tag{9.17}
\]


FIGURE 9.10
If the sign of the expression $f(x) - f(x_0)$ can be determined for values of $x$ to the immediate
left and right of $x_0$, we can readily come to a conclusion as to whether $f(x_0)$ is an
extremum, and if so, whether it is a maximum or a minimum. For this, it is necessary to
examine the right-hand sum of (9.17). Altogether, there are $(n + 1)$ terms in this sum ($n$
terms from $P_n$, plus the remainder, which is in the $(n + 1)$st degree), and thus the actual
number of terms is indefinite, being dependent upon the value of $n$ we choose. However, by
properly choosing $n$, we can always make sure that there will exist only a single term on the
right. This will drastically simplify the task of evaluating the sign of $f(x) - f(x_0)$ and
ascertaining whether $f(x_0)$ is an extremum, and if so, which kind.


Some Specific Cases 
This can be made clearer through some specific illustrations, 


Case 1  $f'(x_0) \neq 0$

If the first derivative at $x_0$ is nonzero, let us choose $n = 0$, so that the remainder will be
in the first degree. Then there will be only $n + 1 = 1$ term on the right side, implying that
only the remainder $R_0$ will be there. That is, we have

\[ f(x) - f(x_0) = \frac{f'(p)}{1!}(x - x_0) = f'(p)(x - x_0) \]

where $p$ is some number between $x_0$ and a value of $x$ in the immediate neighborhood of $x_0$.
Note that $p$ must accordingly be very, very close to $x_0$.

What is the sign of the expression on the right? Because of the continuity of the derivative,
$f'(p)$ will have the same sign as $f'(x_0)$ since, as mentioned before, $p$ is very, very
close to $x_0$. In the present case, $f'(p)$ must be nonzero; in fact, it must be a specific positive
or negative number. But what about the $(x - x_0)$ part? When we go from the left of $x_0$ to its
right, $x$ shifts from a magnitude $x_1 < x_0$ to a magnitude $x_2 > x_0$ (see Fig. 9.10). Consequently,
the expression $(x - x_0)$ must turn from negative to positive as we move, and
$f(x) - f(x_0) = f'(p)(x - x_0)$ must also change sign from the left of $x_0$ to its right. However,
this violates our new definition of a relative extremum; accordingly, there cannot exist
a relative extremum at $f(x_0)$ when $f'(x_0) \neq 0$, a fact that is already well known to us.


Case 2  $f'(x_0) = 0$; $f''(x_0) \neq 0$

In this case, choose $n = 1$, so that the remainder will be in the second degree. Then
initially there will be $n + 1 = 2$ terms on the right. But one of these terms will vanish
because $f'(x_0) = 0$, and we shall again be left with only one term to evaluate:

\[
\begin{aligned}
f(x) - f(x_0) &= f'(x_0)(x - x_0) + \frac{f''(p)}{2!}(x - x_0)^2 \\
              &= \frac{1}{2} f''(p)(x - x_0)^2 \qquad [\text{because } f'(x_0) = 0]
\end{aligned}
\]

As before, $f''(p)$ will have the same sign as $f''(x_0)$, a sign that is specified and unvarying,
whereas the $(x - x_0)^2$ part, being a square, is invariably positive (for $x \neq x_0$). Thus the expression
$f(x) - f(x_0)$ must take the same sign as $f''(x_0)$ and, according to the earlier definition of
relative extremum, will specify

A relative maximum of $f(x)$ if $f''(x_0) < 0$
A relative minimum of $f(x)$ if $f''(x_0) > 0$
[with $f'(x_0) = 0$]

You will recognize this as the second-derivative test introduced earlier.
Case 3  $f'(x_0) = f''(x_0) = 0$, but $f'''(x_0) \neq 0$

Here we are encountering a situation that the second-derivative test is incapable of handling,
for $f''(x_0)$ is now zero. With the help of the Taylor series, however, a conclusive
result can be established without difficulty.

Let us choose $n = 2$; then three terms will initially appear on the right. But two of
these will drop out because $f'(x_0) = f''(x_0) = 0$, so that we again have only one term to
evaluate:

\[
\begin{aligned}
f(x) - f(x_0) &= f'(x_0)(x - x_0) + \frac{1}{2!} f''(x_0)(x - x_0)^2 + \frac{1}{3!} f'''(p)(x - x_0)^3 \\
              &= \frac{1}{6} f'''(p)(x - x_0)^3 \qquad [\text{because } f'(x_0) = 0,\ f''(x_0) = 0]
\end{aligned}
\]

As previously, the sign of $f'''(p)$ is identical with that of $f'''(x_0)$ because of the continuity
of the derivative and because $p$ is very close to $x_0$. But the $(x - x_0)^3$ part has a varying sign.
Specifically, since $(x - x_0)$ is negative to the left of $x_0$, so also will be $(x - x_0)^3$; yet, to the
right of $x_0$, the $(x - x_0)^3$ part will be positive. Thus there is a change in the sign of
$f(x) - f(x_0)$ as we pass through $x_0$, which violates the definition of a relative extremum.
However, we know that $x_0$ is a critical value [$f'(x_0) = 0$], and thus it must give an inflection
point, inasmuch as it does not give a relative extremum.


Case 4  $f'(x_0) = f''(x_0) = \cdots = f^{(N-1)}(x_0) = 0$, but $f^{(N)}(x_0) \neq 0$

This is a very general case, and we can therefore derive a general result from it. Note
that here all the derivative values are zero until we arrive at the Nth one.

Analogously to the preceding three cases, the Taylor series for Case 4 will reduce to

\[ f(x) - f(x_0) = \frac{1}{N!} f^{(N)}(p)(x - x_0)^N \]

Again, $f^{(N)}(p)$ takes the same sign as $f^{(N)}(x_0)$, which is unvarying. The sign of the
$(x - x_0)^N$ part, on the other hand, will vary if $N$ is odd (cf. Cases 1 and 3) and will remain
unchanged (positive) if $N$ is even (cf. Case 2). When $N$ is odd, accordingly, $f(x) - f(x_0)$
will change sign as we pass through the point $x_0$, thereby violating the definition of a
relative extremum (which means that $x_0$ must give us an inflection point on the curve). But
when $N$ is even, $f(x) - f(x_0)$ will not change sign from the left of $x_0$ to its right, and this
will establish the stationary value $f(x_0)$ as a relative maximum or minimum, depending on
whether $f^{(N)}(x_0)$ is negative or positive.


Nth-Derivative Test 
At last, then, we may state the following general test.


Nth-Derivative test for relative extremum of a function of one variable  If the first
derivative of a function \(f(x)\) at \(x_0\) is \(f'(x_0) = 0\) and if the first nonzero derivative value at
\(x_0\) encountered in successive derivation is that of the \(N\)th derivative, \(f^{(N)}(x_0) \neq 0\), then the
stationary value \(f(x_0)\) will be


a. A relative maximum if \(N\) is an even number and \(f^{(N)}(x_0) < 0\).
b. A relative minimum if \(N\) is an even number but \(f^{(N)}(x_0) > 0\).
c. An inflection point if \(N\) is odd.


It should be clear from the preceding statement that the \(N\)th-derivative test can work if
and only if the function \(f(x)\) is capable of yielding, sooner or later, a nonzero derivative
value at the critical value \(x_0\). While there do exist exceptional functions that fail to satisfy
this condition, most of the functions we are likely to encounter will indeed produce nonzero
\(f^{(N)}(x_0)\) in successive differentiation.⁴ Thus the test should prove serviceable in most
instances.


4 If \(f(x)\) is a constant function, for instance, then obviously \(f'(x) = f''(x) = \cdots = 0\), so that no
nonzero derivative value can ever be found. This, however, is a trivial case, since a constant function
requires no test for extremum anyway. As a nontrivial example, consider the function

\[y = \begin{cases} e^{-1/x^2} & (\text{for } x \neq 0) \\ 0 & (\text{for } x = 0) \end{cases}\]

where the function \(y = e^{-1/x^2}\) is an exponential function, yet to be introduced (Chap. 10). By
itself, \(y = e^{-1/x^2}\) is discontinuous at \(x = 0\), because \(x = 0\) is not in the domain (division by zero is
undefined). However, since \(\lim_{x \to 0} y = 0\), we can, by appending the stipulation that \(y = 0\) for \(x = 0\), fill
the gap in the domain and thereby obtain a continuous function. The graph of this function shows
that it attains a minimum at \(x = 0\). But it turns out that, at \(x = 0\), all the derivatives (up to any order)
have zero values. Thus we are unable to apply the \(N\)th-derivative test to confirm the graphically
ascertainable fact that the function has a minimum at \(x = 0\). For further discussion of this exceptional
case, see R. Courant, Differential and Integral Calculus (translated by E. J. McShane), Interscience,
New York, vol. I, 2d ed., 1937, pp. 196, 197, and 336.




Example 1 


Examine the function \(y = (7 - x)^4\) for its relative extremum. Since \(f'(x) = -4(7 - x)^3\) is
zero when \(x = 7\), we take \(x = 7\) as the critical value for testing, with \(y = 0\) as the stationary
value of the function. By successive derivation (continued until we encounter a nonzero
derivative value at the point \(x = 7\)), we get

\[f''(x) = 12(7 - x)^2 \qquad \text{so that} \qquad f''(7) = 0\]
\[f'''(x) = -24(7 - x) \qquad\qquad f'''(7) = 0\]
\[f^{(4)}(x) = 24 \qquad\qquad f^{(4)}(7) = 24\]

Since 4 is an even number and since \(f^{(4)}(7)\) is positive, we conclude that the point (7, 0)
represents a relative minimum.
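
The steps of the test can be mechanized for polynomials. The following Python sketch is an illustration added here, not part of the original exposition. It assumes the function is a polynomial given as a coefficient list (lowest power first), and the helper names `poly_derivative`, `poly_eval`, and `nth_derivative_test` are hypothetical. It is applied to \(y = (7 - x)^4\), expanded as \(x^4 - 28x^3 + 294x^2 - 1372x + 2401\).

```python
def poly_derivative(coeffs):
    """Derivative of a polynomial; coeffs[k] is the coefficient of x**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def poly_eval(coeffs, x):
    """Evaluate the polynomial at x by Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def nth_derivative_test(coeffs, x0):
    """Return (N, verdict), where N is the order of the first nonzero derivative at x0."""
    d = poly_derivative(coeffs)
    if poly_eval(d, x0) != 0:
        raise ValueError("x0 is not a critical value: f'(x0) != 0")
    n = 1
    while d:  # d becomes an empty list once all derivatives vanish identically
        value = poly_eval(d, x0)
        if value != 0:
            if n % 2 == 1:
                return (n, "inflection point")
            if value > 0:
                return (n, "relative minimum")
            return (n, "relative maximum")
        d = poly_derivative(d)
        n += 1
    return (None, "test fails: no nonzero derivative (constant function)")


# y = (7 - x)^4 expanded, tested at the critical value x = 7
print(nth_derivative_test([2401, -1372, 294, -28, 1], 7))
```

For this input the first nonzero derivative is the fourth, and it is positive, which agrees with the conclusion of Example 1. Exact integer arithmetic is used on purpose, so that "derivative equals zero" is a reliable comparison; with floating-point coefficients a tolerance would be needed.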

As is easily verified, this function plots as a strictly convex curve. Inasmuch as the second
derivative at \(x = 7\) is zero (rather than positive), this example serves to illustrate our earlier
statement regarding the second derivative and the curvature of a curve (Sec. 9.3) to the
effect that, while a positive \(f''(x)\) for all \(x\) does imply a strictly convex \(f(x)\), a strictly convex
\(f(x)\) does not imply a positive \(f''(x)\) for all \(x\). More importantly, it also serves to illustrate the
fact that, given a strictly convex (strictly concave) curve, the extremum found on that curve
must be a minimum (maximum), because such an extremum will either satisfy the second-order
sufficient condition, or, failing that, satisfy another (higher-order) sufficient condition
for a minimum (maximum).





EXERCISE 9.6 


1. Find the stationary values of the following functions:
   (a) \(y = x^3\)    (b) \(y = -x^4\)    (c) \(y = x^6 + 5\)
   Determine by the \(N\)th-derivative test whether they represent relative maxima, relative
   minima, or inflection points.
2. Find the stationary values of the following functions:
   (a) \(y = (x - 1)^3 + 16\)    (b) \(y = (x - 2)^4\)
   (c) \(y = (3 - x)^6 + 7\)    (d) \(y = (5 - 2x)^4 + 8\)
   Use the \(N\)th-derivative test to determine the exact nature of these stationary values.


Chapter 10

Exponential and Logarithmic Functions


The \(N\)th-derivative test developed in Chap. 9 equips us for the task of locating the extreme
values of any objective function, as long as it involves only one choice variable, possesses
derivatives to the desired order, and sooner or later yields a nonzero derivative value at the
critical value \(x_0\). In the examples cited in Chap. 9, however, we made use only of polynomial
and rational functions, for which we know how to obtain the necessary derivatives.
Suppose that our objective function happened to be an exponential one, such as


\[y = 8^{x - \sqrt{x}}\]


Then we are still helpless in applying the derivative criterion, because we have yet to learn
how to differentiate such a function. This is what we shall do in the present chapter.

Exponential functions, as well as the closely related logarithmic functions, have important
applications in economics, especially in connection with growth problems, and in economic
dynamics in general. The particular application relevant to the present part of the
book, however, involves a class of optimization problems in which the choice variable is
time. For example, a certain wine dealer may have a stock of wine, the market value of
which is known to increase with time in some prescribed fashion. The problem is to determine
the best time to sell that stock on the basis of the wine-value function, after taking into
account the interest cost involved in having the money capital tied up in that stock. Exponential
functions may enter into such a problem in two ways. First, the value of the wine
may increase with time according to some exponential law of growth. In that event, we
would have an exponential wine-value function. Second, when we consider the interest
cost, the presence of interest compounding will surely introduce an exponential function
into the picture. Thus we must study the nature of exponential functions before we can
discuss this type of optimization problem.

Since our primary purpose is to deal with time as a choice variable, let us now switch to
the symbol \(t\), in lieu of \(x\), to indicate the independent variable in the subsequent discussion.
(However, this same symbol \(t\) can very well represent variables other than time also.)




10.1 The Nature of Exponential Functions 





As introduced in connection with polynomial functions, the term exponent means an indicator
of the power to which a variable is to be raised. In power expressions such as \(x^3\) or \(x^5\),
the exponents are constants; but there is no reason why we cannot also have a variable
exponent, such as in \(3^x\) or \(3^t\), where the number 3 is to be raised to varying powers (various
values of \(x\) or \(t\)). A function whose independent variable appears in the role of an exponent
is called an exponential function.


Simple Exponential Function 
In its simple version, the exponential function may be represented in the form


\[y = f(t) = b^t \qquad (b > 1) \tag{10.1}\]


where \(y\) and \(t\) are the dependent and independent variables, respectively, and \(b\) denotes a
fixed base of the exponent. The domain of such a function is the set of all real numbers.
Thus, unlike the exponents in a polynomial function, the variable exponent \(t\) in (10.1) is not
limited to positive integers, unless we wish to impose such a restriction.

But why the restriction of \(b > 1\)? The explanation is as follows. Since the domain of
the function in (10.1) consists of the set of all real numbers, it is possible for \(t\) to take a
value such as \(\frac{1}{2}\). If \(b\) is allowed to be negative, the half power of \(b\) will involve taking the
square root of a negative number. While this is not an impossible task, we would certainly
prefer to take the easy way out by restricting \(b\) to be positive. Once we adopt the restriction
\(b > 0\), however, we might as well go all the way to the restriction \(b > 1\): The restriction
\(b > 1\) differs from \(b > 0\) only in the further exclusion of the cases of (1) \(0 < b < 1\) and
(2) \(b = 1\); but as will be shown, the first case can be subsumed under the restriction \(b > 1\),
whereas the second case can be dismissed outright. Consider the first case. If \(b = \frac{1}{5}\), then
we have

\[y = \left(\frac{1}{5}\right)^t = \frac{1}{5^t} = 5^{-t}\]

This shows that a function with a fractional base can easily be rewritten into one with a base
greater than 1. As for the second case, the fact that \(b = 1\) will give us the function
\(y = 1^t = 1\), so that the exponential function actually degenerates into a constant function;
it may therefore be disqualified as a member of the exponential family.


Graphical Form 

The graph of the exponential function in (10.1) takes the general shape of the curve in
Fig. 10.1. The curve drawn is based on the value \(b = 2\); but even for other values of \(b\), the
same general configuration will prevail.

Several salient features of this type of exponential curve may be noted. First, it is continuous
and smooth everywhere; thus the function should be everywhere differentiable. As
a matter of fact, it is continuously differentiable any number of times. Second, it is strictly
increasing, and in fact \(y\) increases at an increasing rate throughout. Consequently, both
the first and second derivatives of the function \(y = b^t\) should be positive, a fact we should
be able to confirm after we have developed the relevant differentiation formulas. Third,
we note that, even though the domain of the function contains negative as well as positive
numbers, the range of the function is limited to the open interval \((0, \infty)\). That is, the
dependent variable \(y\) is invariably positive, regardless of the sign of the independent
variable \(t\).


FIGURE 10.1  (graph of \(y = b^t\), drawn for \(b = 2\))

The strict monotonicity of the exponential function has at least two interesting and significant
implications. First, we may infer that the exponential function must have an inverse
function, which is itself strictly monotonic. This inverse function, we shall find, turns out to
be a logarithmic function. Second, since strict monotonicity means that there is a unique
value of \(t\) for a given value of \(y\) and since the range of the exponential function is the interval
\((0, \infty)\), it follows that we should be able to express any positive number as a unique
power of a base \(b > 1\). This can be seen from Fig. 10.1, where the curve of \(y = 2^t\) covers
all the positive values of \(y\) in its range; therefore any positive value of \(y\) must be expressible
as some unique power of the number 2. Actually, even if the base is changed to some other
real number greater than 1, the same range holds, so that it is possible to express any positive
number \(y\) as a power of any base \(b > 1\).
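
The claim that every positive \(y\) is a unique power of a base \(b > 1\) can be checked numerically without logarithms, which have not yet been introduced. The Python sketch below is an added illustration, not the book's method: it relies only on the strict monotonicity of \(b^t\) and finds the exponent by bisection. The function name `exponent_for` and the tolerance are assumptions made for this example.

```python
def exponent_for(y, b, tol=1e-12):
    """Find t such that b**t == y (approximately), for y > 0 and b > 1.

    Because b**t is strictly increasing in t, bisection on a bracketing
    interval converges to the single value of t that works.
    """
    if y <= 0 or b <= 1:
        raise ValueError("requires y > 0 and b > 1")
    lo, hi = -1.0, 1.0
    while b ** lo > y:   # widen the bracket downward until b**lo <= y
        lo *= 2
    while b ** hi < y:   # widen the bracket upward until b**hi >= y
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if b ** mid < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(exponent_for(8, 2))     # exponent t with 2**t = 8
print(exponent_for(0.25, 2))  # a y below 1 gives a negative exponent
```

The same routine works for any other base greater than 1, which is the sense in which the base is a matter of choice.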


Generalized Exponential Function 

This last point deserves closer scrutiny. If a positive \(y\) can indeed be expressed as powers of
various alternative bases, then there must exist a general procedure of base conversion. In the
case of the function \(y = 9^t\), for instance, we can readily transform it into \(y = (3^2)^t = 3^{2t}\),
thereby converting the base from 9 to 3, provided the exponent is duly altered from \(t\) to \(2t\).
This change in exponent, necessitated by the base conversion, does not create any new type
of function, for, if we let \(w = 2t\), then \(y = 3^{2t} = 3^w\) is still in the form of (10.1). From the
point of view of the base 3, however, the exponent is now \(2t\) rather than \(t\). What is the effect
of adding a numerical coefficient (here, 2) to the exponent \(t\)?




FIGURE 10.2  Panel (a): \(y = f(t) = b^t\) and \(y = g(t) = b^{2t}\). Panel (b): \(y = b^t\) and \(y = 2b^t\).


The answer is to be found in Fig. 10.2a, where two curves are drawn: one for the function
\(y = f(t) = b^t\) and one for another function \(y = g(t) = b^{2t}\). Since the exponent in the
latter is exactly twice that of the former, and since the identical base is adopted for the two
functions, the assignment of an arbitrary value \(t = t_0\) in the function \(g\) and \(t = 2t_0\) in the
function \(f\) must yield the same value:


\[f(2t_0) = g(t_0) = b^{2t_0} = y_0\]


Thus the distance \(y_0 J\) will be half of \(y_0 K\). By similar reasoning, for any value of \(y\), the
function \(g\) should be exactly halfway between the function \(f\) and the vertical axis. It may
be concluded, therefore, that the doubling of the exponent has the effect of compressing
the exponential curve halfway toward the \(y\) axis, whereas halving the exponent will extend
the curve away from the \(y\) axis to twice the horizontal distance.

It is of interest that both functions share the same vertical intercept

\[f(0) = g(0) = b^0 = 1\]

The change of the exponent \(t\) to \(2t\), or to any other multiple of \(t\), will leave the vertical
intercept unaffected. In terms of compressing, this is because compressing a zero horizontal
distance will still yield a zero distance.

The change of exponent is one way of modifying (and generalizing) the exponential
function of (10.1); another way is to attach a coefficient to \(b^t\), such as \(2b^t\). [Warning:
\(2b^t \neq (2b)^t\).] The effect of such a coefficient is also to compress or extend the curve, except
that this time the direction is vertical. In Fig. 10.2b, the higher curve represents \(y = 2b^t\),
and the lower one is \(y = b^t\). For every value of \(t\), the former must obviously be twice as
high, because it has a \(y\) value twice as large as the latter. Thus we have \(t_0 J' = J'K'\). Note
that the vertical intercept, too, is changed in the present case. We may conclude that
doubling the coefficient (here, from 1 to 2) serves to extend the curve away from the horizontal
axis to twice the vertical distance, whereas halving the coefficient will compress the
curve halfway toward the \(t\) axis.

With the knowledge of the two modifications just discussed, the exponential function
\(y = b^t\) can now be generalized to the form


\[y = ab^{ct} \tag{10.2}\]




where \(a\) and \(c\) are “compressing” or “extending” agents. When assigned various values,
they will alter the position of the exponential curve, thus generating a whole family of
exponential curves (functions). If \(a\) and \(c\) are positive, the general configuration shown in
Fig. 10.2 will prevail; if \(a\) or \(c\) or both are negative, however, then fundamental modifications
will occur in the configuration of the curve (see Exercise 10.1-5).
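
The two modifications can be verified numerically. The following Python sketch is an added illustration with a hypothetical function name, `general_exp`, and arbitrary sample values. It checks the horizontal compression from doubling the exponent, the vertical extension from doubling the coefficient, the effect on the vertical intercept, and the base conversion from 9 to 3.

```python
def general_exp(a, b, c, t):
    """Value of the generalized exponential function y = a * b**(c*t), as in (10.2)."""
    return a * b ** (c * t)


b, t0 = 2, 1.5

# Doubling the exponent: g(t0) = b**(2*t0) equals f(2*t0) = b**(2*t0),
# so the curve of g reaches any height in half the horizontal distance.
print(general_exp(1, b, 2, t0), general_exp(1, b, 1, 2 * t0))

# Doubling the coefficient: the curve is twice as high at every t.
print(general_exp(2, b, 1, t0), 2 * general_exp(1, b, 1, t0))

# Vertical intercepts: unchanged by c, scaled by a.
print(general_exp(1, b, 2, 0), general_exp(2, b, 1, 0))

# Base conversion: 9**t and 3**(2t) are the same function.
print(general_exp(1, 9, 1, 0.7), general_exp(1, 3, 2, 0.7))
```

Negative values of `a` or `c` can be passed to the same function to explore the changes in configuration mentioned in the text.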


A Preferred Base 
What prompted the discussion of the change of exponent from \(t\) to \(ct\) was the question of
base conversion. But, granting the feasibility of base conversion, why would one want to do
it anyhow? One answer is that some bases are more convenient than others as far as mathematical
manipulations are concerned.

Curiously enough, in calculus, the preferred base happens to be a certain irrational number
denoted by the symbol \(e\):


\[e = 2.71828\ldots\]

When this base \(e\) is used in an exponential function, it is referred to as a natural exponential
function, examples of which are

\[y = e^t \qquad y = e^{3t} \qquad y = Ae^{rt}\]

These illustrative functions can also be expressed by the alternative notations

\[y = \exp(t) \qquad y = \exp(3t) \qquad y = A\exp(rt)\]


where the abbreviation exp (for exponential) indicates that \(e\) is to have as its exponent the
expression in parentheses.

The choice of such an outlandish number as \(e = 2.71828\ldots\) as the preferred base no
doubt seems bewildering. But there is an excellent reason for this choice, for the function
\(e^t\) possesses the remarkable property of being its own derivative! That is,

\[\frac{d}{dt} e^t = e^t\]

a fact that reduces the work of differentiation to no work at all. Moreover, armed with this
differentiation rule (to be proved in Section 10.5), it will also be easy to find the derivative
of a more complicated natural exponential function such as \(y = Ae^{rt}\). To do this, first
let \(w = rt\), so that the function becomes


\[y = Ae^w \qquad \text{where } w = rt, \text{ and } A, r \text{ are constants}\]


Then, by the chain rule, we can write


\[\frac{dy}{dt} = \frac{dy}{dw}\frac{dw}{dt} = Ae^w(r) = rAe^{rt}\]

That is,

\[\frac{d}{dt} Ae^{rt} = rAe^{rt} \tag{10.3}\]


The mathematical convenience of the base \(e\) should thus be amply clear.
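
Rule (10.3) can be checked numerically before it is proved. The Python sketch below is an added illustration: it compares a central-difference approximation of the derivative of \(Ae^{rt}\) with the value \(rAe^{rt}\). The sample values of `A`, `r`, and `t0`, the step size `h`, and the function names are assumptions chosen only for the example.

```python
import math


def natural_exp(A, r, t):
    """Value of the natural exponential function A * e**(r*t)."""
    return A * math.exp(r * t)


def central_difference(f, t, h=1e-6):
    """Numerical approximation of f'(t) using a symmetric difference quotient."""
    return (f(t + h) - f(t - h)) / (2 * h)


A, r, t0 = 4.0, 0.05, 2.0
numeric = central_difference(lambda t: natural_exp(A, r, t), t0)
analytic = r * natural_exp(A, r, t0)  # right-hand side of (10.3)
print(numeric, analytic)
```

The two printed values agree to many decimal places, as (10.3) predicts. A numerical check of this kind is only a plausibility test, of course, and does not replace the proof deferred to Section 10.5.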







EXERCISE 10.1 


1. Plot in a single diagram the graphs of the exponential functions \(y = 3^t\) and \(y = 3^{2t}\).
   (a) Do the two graphs display the same general positional relationship as shown in
       Fig. 10.2a?
   (b) Do these two curves share the same \(y\) intercept? Why?
   (c) Sketch the graph of the function \(y = 3^{3t}\) in the same diagram.
2. Plot in a single diagram the graphs of the exponential functions \(y = 4^t\) and \(y = 3(4^t)\).
   (a) Do the two graphs display the general positional relationship suggested in
       Fig. 10.2b?
   (b) Do the two curves have the same \(y\) intercept? Why?
   (c) Sketch the graph of the function \(y = \frac{3}{2}(4^t)\) in the same diagram.
3. Taking for granted that \(e^t\) is its own derivative, use the chain rule to find \(dy/dt\) for the
   following:
   (a) \(y = e^{5t}\)    (b) \(y = 4e^{3t}\)    (c) \(y = 6e^{-2t}\)
4. In view of our discussion about (10.1), do you expect the function \(y = e^t\) to be strictly
   increasing at an increasing rate? Verify your answer by determining the signs of the first
   and second derivatives of this function. In doing so, remember that the domain of this
   function is the set of all real numbers, i.e., the interval \((-\infty, \infty)\).
5. In (10.2), if negative values are assigned to \(a\) and \(c\), the general shape of the curves in
   Fig. 10.2 will no longer prevail. Examine the change in curve configuration by contrasting
   (a) the case of \(a = -1\) against the case of \(a = 1\), and (b) the case of \(c = -1\)
   against the case of \(c = 1\).


10.2 Natural Exponential Functions and the Problem of Growth





The pertinent questions still unanswered are: How is the number \(e\) defined? Does it have
any economic meaning in addition to its mathematical significance as a convenient base?
And, in what ways do natural exponential functions apply to economic analysis?


The Number e 


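Let us consider the following function:

\[f(m) = \left(1 + \frac{1}{m}\right)^m \tag{10.4}\]


If larger and larger values are assigned to \(m\), then \(f(m)\) will also assume larger values;
specifically, we find that


\[f(1) = \left(1 + \tfrac{1}{1}\right)^1 = 2\]
\[f(2) = \left(1 + \tfrac{1}{2}\right)^2 = 2.25\]
\[f(3) = \left(1 + \tfrac{1}{3}\right)^3 = 2.37037\ldots\]
\[f(4) = \left(1 + \tfrac{1}{4}\right)^4 = 2.44141\ldots\]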



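Moreover, if \(m\) is increased indefinitely, then \(f(m)\) will converge to the number
\(2.71828\ldots \equiv e\); thus \(e\) may be defined as the limit of (10.4) as \(m \to \infty\):


\[e \equiv \lim_{m \to \infty} f(m) = \lim_{m \to \infty} \left(1 + \frac{1}{m}\right)^m \tag{10.5}\]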
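
The convergence in (10.5) is easy to observe by direct computation. The short Python sketch below is an added illustration; the particular values of \(m\) are arbitrary choices. It tabulates \(f(m)\) and its distance from the library constant `math.e`.

```python
import math


def f(m):
    """The function (10.4): (1 + 1/m)**m."""
    return (1 + 1 / m) ** m


for m in (1, 2, 3, 4, 100, 10_000, 1_000_000):
    # The gap to e shrinks as m grows, but only slowly.
    print(m, round(f(m), 5), math.e - f(m))
```

The first four rows reproduce the values listed above. The gap to \(e\) shrinks roughly in proportion to \(1/m\), so this limit is a slow way to compute \(e\); the series developed next converges far faster.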


That the approximate value of \(e\) is 2.71828 can be verified by finding the Maclaurin
series of the function \(\phi(x) = e^x\), with \(x\) used here to facilitate the direct application of the
expansion formula (9.14). Such a series will give us a polynomial approximation to \(e^x\), and
thus the value of \(e\) (\(= e^1\)) may be approximated by setting \(x = 1\) in that polynomial. If the
remainder term \(R_n\) approaches zero as the number of terms in the series is increased indefinitely,
i.e., if the series is convergent to \(\phi(x)\), then we can indeed approximate the value of
\(e\) to any desired degree of accuracy by making the number of included terms sufficiently
large.

To this end, we need to have derivatives of various orders for the function. Accepting the
fact that the first derivative of \(e^x\) is \(e^x\) itself, we can see that the derivative of \(\phi(x)\) is simply
\(e^x\) and, similarly, that the second, third, or any higher-order derivatives must be \(e^x\) as well.
Hence, when we evaluate all the derivatives at the expansion point (\(x_0 = 0\)), we have the
gratifyingly neat result


\[\phi'(0) = \phi''(0) = \cdots = \phi^{(n)}(0) = e^0 = 1\]

Consequently, by setting \(x_0 = 0\) in (9.14), the Maclaurin series of \(e^x\) is


\[e^x = \phi(x) = \phi(0) + \phi'(0)x + \frac{\phi''(0)}{2!}x^2 + \frac{\phi'''(0)}{3!}x^3 + \cdots + \frac{\phi^{(n)}(0)}{n!}x^n + R_n\]

\[= 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \cdots + \frac{1}{n!}x^n + R_n\]


The remainder term \(R_n\), according to (9.15), can be written as


\[R_n = \frac{\phi^{(n+1)}(p)}{(n+1)!}x^{n+1} = \frac{e^p}{(n+1)!}x^{n+1} \qquad [\phi^{(n+1)}(x) = e^x;\ \therefore\ \phi^{(n+1)}(p) = e^p]\]




Inasmuch as the factorial expression \((n + 1)!\) increases in value more rapidly than the
power expression \(x^{n+1}\) (for a finite \(x\)) as \(n\) increases, it follows that \(R_n \to 0\) as \(n \to \infty\).
Thus the Maclaurin series converges, and the value of \(e^x\) may, as a result, be expressed as a
convergent infinite series as follows:


\[e^x = 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \frac{1}{4!}x^4 + \frac{1}{5!}x^5 + \cdots \tag{10.6}\]

As a special case, for \(x = 1\), we find that

\[e = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \frac{1}{5!} + \cdots\]

\[= 2 + 0.5 + 0.1666667 + 0.0416667 + 0.0083333 + 0.0013889 + 0.0001984 + 0.0000248 + 0.0000028 + 0.0000003 + \cdots\]

\[= 2.7182819\]




Thus, if we want a figure accurate to five decimal places, we can write \(e = 2.71828\). Note
that we need not worry about the subsequent terms in the infinite series, because they will
be of negligible magnitude if we are concerned only with five decimal places.
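
The partial sums of (10.6) can be computed directly. The Python sketch below is an added illustration with a hypothetical function name, `maclaurin_exp`. It sums the series through the term \(x^n/n!\), building each term from the previous one so that no factorial has to be evaluated separately.

```python
import math


def maclaurin_exp(x, n):
    """Partial sum of the series (10.6): 1 + x + x**2/2! + ... + x**n/n!."""
    term = 1.0    # the k = 0 term
    total = 1.0
    for k in range(1, n + 1):
        term *= x / k   # turns x**(k-1)/(k-1)! into x**k/k!
        total += term
    return total


# With x = 1, the terms through 1/10! are the ones written out in the text.
approx = maclaurin_exp(1, 10)
print(approx, math.e - approx)
```

With the terms through \(1/10!\) the partial sum already agrees with \(e\) to five decimal places, and the remaining error is below \(10^{-7}\), which is why the later terms can be ignored at that precision.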


An Economic Interpretation of e 

Mathematically, the number \(e\) is the limit expression in (10.5). But does it also possess
some economic meaning? The answer is that it can be interpreted as the result of a special
mode of interest compounding.

Suppose that, starting out with a principal (or capital) of $1, we find a hypothetical
banker to offer us the unusual interest rate of 100 percent per annum ($1 interest per year).
If interest is to be compounded once a year, the value of our asset at the end of the year will
be $2; we shall denote this value by \(V(1)\), where the number in parentheses indicates the
frequency of compounding within 1 year:

\[V(1) = \text{initial principal} \times (1 + \text{interest rate}) = 1(1 + 100\%) = \left(1 + \tfrac{1}{1}\right)^1 = 2\]

If interest is compounded semiannually, however, an interest amounting to 50 percent
(half of 100 percent) of principal will accrue at the end of 6 months. We shall therefore have
$1.50 as the new principal during the second 6-month period, in which interest will be
calculated at 50 percent of $1.50. Thus our year-end asset value will be \(1.50(1 + 50\%)\);
that is,


\[V(2) = (1 + 50\%)(1 + 50\%) = \left(1 + \tfrac{1}{2}\right)^2\]


By analogous reasoning, we can write \(V(3) = \left(1 + \frac{1}{3}\right)^3\), \(V(4) = \left(1 + \frac{1}{4}\right)^4\), etc.; or, in
general,

\[V(m) = \left(1 + \frac{1}{m}\right)^m \tag{10.7}\]

where \(m\) represents the frequency of compounding in 1 year.

In the limiting case, when interest is compounded continuously during the year, i.e.,
when \(m\) becomes infinite, the value of the asset will grow in a “snowballing” fashion,
becoming at the end of 1 year


\[\lim_{m \to \infty} V(m) = \lim_{m \to \infty} \left(1 + \frac{1}{m}\right)^m = e \ \text{(dollars)} \qquad [\text{by } (10.5)]\]


Thus, the number \(e = 2.71828\ldots\) can be interpreted as the year-end value to which a principal
of $1 will grow if interest at the rate of 100 percent per annum is compounded
continuously.

Note that the interest rate of 100 percent is only a nominal interest rate, for if $1
becomes $\(e\) = $2.718 after 1 year, the effective interest rate is in this case approximately
172 percent per annum.
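
The distinction between the nominal and the effective rate can be made concrete. The Python sketch below is an added illustration; the function name `effective_rate` and the convention that `m=None` stands for continuous compounding are assumptions of the example. It computes the interest actually earned on $1 in one year for several compounding frequencies at a nominal rate of 100 percent.

```python
import math


def effective_rate(r, m=None):
    """Effective annual rate for nominal rate r compounded m times a year.

    m=None is used here to mean continuous compounding (the limit m -> infinity).
    """
    if m is None:
        return math.exp(r) - 1
    return (1 + r / m) ** m - 1


for m in (1, 2, 4, 12, 365, None):
    print(m, effective_rate(1.0, m))
```

Annual compounding gives an effective rate of 100 percent, semiannual compounding gives 125 percent, and the continuous case gives \(e - 1\), or about 172 percent, as stated in the text.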


Interest Compounding and the Function \(Ae^{rt}\)

The continuous interest-compounding process just discussed can be generalized in three
directions, to allow for: (1) more years of compounding, (2) a principal other than $1, and
(3) a nominal interest rate other than 100 percent.




If a principal of $1 becomes $\(e\) after 1 year of continuous compounding and if we let $\(e\)
be the new principal in the second year (during which every dollar will again grow into $\(e\)),
our asset value at the end of 2 years will obviously become $\(e(e)\) = $\(e^2\). By the same token,
it will become $\(e^3\) at the end of 3 years or, more generally, will become $\(e^t\) after \(t\) years.

Next, let us change the principal from $1 to an unspecified amount, $\(A\). This change is
easily taken care of: if $1 will grow into $\(e^t\) after \(t\) years of continuous compounding at the
nominal rate of 100 percent per annum, it stands to reason that $\(A\) will grow into $\(Ae^t\).

How about a nominal interest rate of other than 100 percent, for instance, \(r = 0.05\)
(= 5 percent)? The effect of this rate change is to alter the expression \(Ae^t\) to \(Ae^{rt}\), as can be
verified from the following. With an initial principal of $\(A\), to be invested for \(t\) years at a
nominal interest rate \(r\), the compound-interest formula (10.7) must be modified to the form


\[V(m) = A\left(1 + \frac{r}{m}\right)^{mt} \tag{10.8}\]


The insertion of the coefficient \(A\) reflects the change of principal from the previous level of
$1. The quotient expression \(r/m\) means that, in each of the \(m\) compounding periods in a
year, only \(1/m\) of the nominal rate \(r\) will actually be applicable. Finally, the exponent \(mt\)
tells us that, since interest is to be compounded \(m\) times a year, there should be a total of \(mt\)
compoundings in \(t\) years.

The formula (10.8) can be transformed into an alternative form


\[V(m) = A\left[\left(1 + \frac{r}{m}\right)^{m/r}\right]^{rt} = A\left[\left(1 + \frac{1}{w}\right)^{w}\right]^{rt} \qquad \text{where } w \equiv \frac{m}{r} \tag{10.8'}\]


As the frequency of compounding \(m\) is increased, the newly created variable \(w\) must
increase pari passu; thus, as \(m \to \infty\), we have \(w \to \infty\), and the bracketed expression in
(10.8'), by virtue of (10.5), tends to the number \(e\). Consequently, we find the asset value in
the generalized continuous-compounding process to be


\[V \equiv \lim_{m \to \infty} V(m) = Ae^{rt} \tag{10.8''}\]


as anticipated.
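
The passage from (10.8) to (10.8'') can be watched numerically. The Python sketch below is an added illustration; the sample principal, rate, and horizon, and the function names, are assumptions chosen for the example. It evaluates the discrete formula for increasing compounding frequencies and compares the results with \(Ae^{rt}\).

```python
import math


def asset_value(A, r, t, m):
    """Formula (10.8): principal A, nominal rate r, t years, m compoundings a year."""
    return A * (1 + r / m) ** (m * t)


def asset_value_continuous(A, r, t):
    """Formula (10.8''): the limiting value A * e**(r*t)."""
    return A * math.exp(r * t)


A, r, t = 100.0, 0.05, 10
for m in (1, 4, 12, 365, 1_000_000):
    print(m, asset_value(A, r, t, m))
print("continuous", asset_value_continuous(A, r, t))
```

The discrete values rise with \(m\) and approach the continuous value from below, which is the numerical counterpart of the limit argument above.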

Note that, in (10.8), \(t\) is a discrete (as against a continuous) variable: It can only take values
that are integral multiples of \(1/m\). For example, if \(m = 4\) (compounding on a quarterly
basis), then \(t\) can only take the values of \(\frac{1}{4}, \frac{1}{2}, \frac{3}{4}, 1\), etc., indicating that \(V(m)\) will assume
a new value only at the end of each new quarter. When \(m \to \infty\), as in (10.8''), however,
\(1/m\) becomes infinitesimal, and accordingly the variable \(t\) will become continuous. In that
case, it becomes legitimate to speak of fractions of a year and to let \(t\) be, say, 1.2 or 2.35.

The upshot is that the expressions \(e\), \(e^t\), \(Ae^t\), and \(Ae^{rt}\) can all be interpreted economically
in connection with continuous interest compounding, as summarized in Table 10.1.


TABLE 10.1  Continuous Interest Compounding

Principal, $    Nominal Interest Rate    Years of Continuous Compounding    Asset Value, at the End of Compounding Process, $
1               100% (= 1)               1                                  \(e\)
1               100%                     \(t\)                              \(e^t\)
\(A\)           100%                     \(t\)                              \(Ae^t\)
\(A\)           \(r\)                    \(t\)                              \(Ae^{rt}\)


Instantaneous Rate of Growth


It should be pointed out, however, that interest compounding is an illustrative, but not
exclusive, interpretation of the natural exponential function \(Ae^{rt}\). Interest compounding
merely exemplifies the general process of exponential growth (here, the growth of a sum
of money capital over time), and we can apply the function equally well to the growth of
population, wealth, or real capital.

Applied to some context other than interest compounding, the coefficient \(r\) in \(Ae^{rt}\) no
longer denotes the nominal interest rate. What economic meaning does it then take? The
answer is that \(r\) can be reinterpreted as the instantaneous rate of growth of the function
\(Ae^{rt}\). (In fact, this is why we have adopted the symbol \(r\), for rate of growth, in the first
place.) Given the function \(V = Ae^{rt}\), which gives the value of \(V\) at each point of time \(t\), the
rate of change of \(V\) is to be found in the derivative


\[\frac{dV}{dt} = rAe^{rt} = rV \qquad [\text{see } (10.3)]\]

But the rate of growth of \(V\) is simply the rate of change in \(V\) expressed in relative
(percentage) terms, i.e., expressed as a ratio to the value of \(V\) itself. Thus, for any given
point of time, we have

\[\text{Rate of growth of } V \equiv \frac{dV/dt}{V} = \frac{rV}{V} = r \tag{10.9}\]


as was stated previously.

Several observations should be made about this rate of growth. But, first, let us clarify a
fundamental point regarding the concept of time, namely, the distinction between a point of
time and a period of time. The variable \(V\) (denoting a sum of money, or the size of population,
etc.) is a stock concept, which is concerned with the question: How much of it exists
at a given moment? As such, \(V\) is related to the point concept of time; at each point of time,
\(V\) takes a unique value. The change in \(V\), on the other hand, represents a flow, which
involves the question: How much of it takes place during a given time span? Hence a
change in \(V\) and, by the same token, the rate of change of \(V\) must have reference to some
specified period of time, say, per year.

With this understanding, let us return to (10.9) for some comments:





1. The rate of growth defined in (10.9) is an instantaneous rate of growth. Since the derivative
\(dV/dt = rAe^{rt}\) takes a different value at a different point of \(t\), as will \(V = Ae^{rt}\),
their ratio must also have reference to a specific point (or instant) of \(t\). In this sense, the
rate of growth is instantaneous.

2. In the present case, however, the instantaneous rate of growth happens to be a constant
\(r\), with the rate of growth thus remaining uniform at all points of time. This may not, of
course, be true of all growth situations actually encountered.

3. Even though the rate of growth \(r\) is measured at a particular point of time, its magnitude
nevertheless has the connotation of so many percent per unit of time, say, per year (if \(t\) is
measured in year units). Growth, by its very nature, can occur only over a time interval.
This is why a single still picture (recording the situation at one instant) could never portray,
say, the growth of a child, whereas two still pictures taken at different times (say,
a year apart) can accomplish this. To say that \(V\) has a rate of growth of \(r\) at the instant
\(t = t_0\), therefore, really means that, if the rate of change \(dV/dt\ (= rV)\) prevailing at
\(t = t_0\) is allowed to continue undisturbed for one whole unit of time (1 year), then \(V\) will
have grown by the amount \(rV\) at the end of the year.

4. For the exponential function \(V = Ae^{rt}\), the percentage rate of growth is constant at all
points of \(t\), but the absolute amount of increment of \(V\) increases as time goes on, because
the percentage rate will be calculated on larger and larger bases.


Upon interpreting \(r\) as the instantaneous rate of growth, it is clear that little effort will
henceforth be required to find the rate of growth of a natural exponential function of the
form \(y = Ae^{rt}\), provided \(r\) is a constant. Given a function \(y = 75e^{0.02t}\), for instance, we can
immediately read off the rate of growth of \(y\) as 0.02 or 2 percent per period.
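
Definition (10.9) can be applied numerically to this example. The Python sketch below is an added illustration; the function name `growth_rate`, the step size, and the sample points of time are assumptions of the example. It approximates \((dV/dt)/V\) by a central difference for \(V = 75e^{0.02t}\) at several points of time.

```python
import math


def growth_rate(V, t, h=1e-6):
    """Approximate the instantaneous rate of growth (dV/dt) / V at time t."""
    dV_dt = (V(t + h) - V(t - h)) / (2 * h)   # central-difference derivative
    return dV_dt / V(t)


def V(t):
    """The example function y = 75 * e**(0.02 t)."""
    return 75 * math.exp(0.02 * t)


rates = [growth_rate(V, t) for t in (0, 1, 10, 50)]
levels = [V(t) for t in (0, 1, 10, 50)]
print(rates)
print(levels)
```

The computed rate stays at 0.02 (up to numerical error) at every point of time, while the level of \(V\), and hence the absolute increment \(rV\), keeps rising. This is the content of comments 2 and 4 above.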


Continuous versus Discrete Growth 

The preceding discussion, though analytically interesting, is still open to question insofar 
as economic relevance is concerned, because in actuality growth does not always take place 
on a continuous basis—not even in interest compounding. Fortunately, however, even for 
cases of discrete growth, where changes occur only once per period rather than from instant 
to instant, the continuous exponential growth function can be justifiably used. 

For one thing, in cases where the frequency of compounding is relatively high, though
not infinite, the continuous pattern of growth may be regarded as an approximation to the
true growth pattern. But, more importantly, we can show that a problem of discrete or
discontinuous growth can always be transformed into an equivalent continuous version.

Suppose that we have a geometric pattern of growth (say, the discrete compounding of
interest) as shown by the following sequence:


\[A, \quad A(1 + i), \quad A(1 + i)^2, \quad A(1 + i)^3, \quad \ldots\]


where the effective interest rate per period is denoted by \(i\) and where the exponent of the
expression \((1 + i)\) denotes the number of periods covered in the compounding. If we consider
\((1 + i)\) to be the base \(b\) in an exponential expression, then the given sequence may be
summarized by the exponential function \(Ab^t\), except that, because of the discrete nature
of the problem, \(t\) is restricted to integer values only. Moreover, \(b = 1 + i\) is a positive number
(positive even if \(i\) is a negative interest rate, say, \(-0.04\)), so that it can always be
expressed as a power of any real number greater than 1, including \(e\). This means that there
must exist a number \(r\) such that†


\[1 + i = b = e^r\]


† The method of finding the number \(r\), given a specific value of \(b\), will be discussed in Sec. 10.4.




Thus we can transform 4d! into a natural exponential function: 
AQ +i)! = Ab! = Ae” 


For any given value of $t$ (in this context, integer values of $t$), the function $Ae^{rt}$ will, 
of course, yield exactly the same value as $A(1+i)^t$, such as $A(1+i) = Ae^{r}$ and 
$A(1+i)^2 = Ae^{2r}$. Consequently, even though a discrete case $A(1+i)^t$ is being considered, 
we may still work with the continuous natural exponential function $Ae^{rt}$. This 
explains why natural exponential functions are extensively applied in economic analysis 
despite the fact that not all growth patterns may actually be continuous. 
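The equivalence between the discrete and continuous forms can be checked numerically. The following is a minimal sketch, not part of the original text; the function names and the sample values ($A = 100$, $i = 0.05$ and $i = -0.04$) are illustrative assumptions.

```python
import math

def equivalent_continuous_rate(i):
    """Return the r that satisfies e**r = 1 + i (requires 1 + i > 0)."""
    if 1.0 + i <= 0.0:
        raise ValueError("1 + i must be positive")
    return math.log(1.0 + i)

def discrete_value(A, i, t):
    """A(1 + i)**t: value after t whole periods of discrete compounding."""
    return A * (1.0 + i) ** t

def continuous_value(A, r, t):
    """A*e**(r*t): the natural exponential function evaluated at t."""
    return A * math.exp(r * t)

if __name__ == "__main__":
    A = 100.0  # hypothetical principal
    for i in (0.05, -0.04):  # hypothetical rates, one of them negative
        r = equivalent_continuous_rate(i)
        for t in range(4):
            # The two columns agree at every integer t.
            print(i, t, discrete_value(A, i, t), continuous_value(A, r, t))
```

Note that a negative $i$ simply produces a negative $r$, in line with the remark that $1 + i$ stays positive.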


Discounting and Negative Growth 

Let us now turn briefly from interest compounding to the closely related concept of 
discounting. In a compound-interest problem, we seek to compute the future value $V$ (principal 
plus interest) from a given present value $A$ (initial principal). The problem of 
discounting is the opposite one of finding the present value $A$ of a given sum $V$ which is to 
be available $t$ years from now. 

Let us take the discrete case first. If the amount of principal $A$ will grow into the future 
value of $A(1 + i)^t$ after $t$ years of annual compounding at the interest rate $i$ per annum, 
i.e., if 

$$V = A(1+i)^t$$


then, by dividing both sides of the equation by the nonzero expression $(1 + i)^t$, we can get 
the discounting formula: 

$$A = \frac{V}{(1+i)^t} = V(1+i)^{-t} \tag{10.10}$$

which involves a negative exponent. It should be realized that in this formula the roles of $V$ 
and $A$ have been reversed: $V$ is now a given, whereas $A$ is the unknown, to be computed 
from $i$ (the rate of discount) and $t$ (the number of years), as well as $V$. 
Similarly, for the continuous case, if the principal $A$ will grow into $Ae^{rt}$ after $t$ years of 
continuous compounding at the rate $r$ in accordance with the formula 

$$V = Ae^{rt}$$

then we can derive the corresponding continuous-discounting formula simply by dividing 
both sides of the last equation by $e^{rt}$: 

$$A = \frac{V}{e^{rt}} = Ve^{-rt} \tag{10.11}$$

Here again, we have $A$ (rather than $V$) as the unknown, to be computed from the given 
future value $V$, the nominal rate of discount $r$, and the number of years $t$. The expression 
$e^{-rt}$ is often referred to as the discount factor. 

Taking (10.11) as an exponential growth function, we can immediately read $-r$ as the 
instantaneous rate of growth of $A$. Being negative, this rate is in effect a rate of decay. Just 
as interest compounding exemplifies the process of growth, discounting illustrates negative 
growth. 
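As a rough illustration of (10.10) and (10.11), the sketch below computes present values in both the discrete and the continuous case. The function names and the numbers used are assumptions made for the example only.

```python
import math

def present_value_discrete(V, i, t):
    """Discounting formula (10.10): A = V(1 + i)**(-t)."""
    return V * (1.0 + i) ** (-t)

def present_value_continuous(V, r, t):
    """Discounting formula (10.11): A = V*e**(-r*t)."""
    return V * math.exp(-r * t)

if __name__ == "__main__":
    # Hypothetical data: a principal of 100 compounded for 2 years at 5%.
    V_discrete = 100.0 * 1.05 ** 2
    V_continuous = 100.0 * math.exp(0.05 * 2)
    # Discounting each future value recovers the principal (up to rounding).
    print(present_value_discrete(V_discrete, 0.05, 2))
    print(present_value_continuous(V_continuous, 0.05, 2))
```

Discounting is thus nothing more than compounding run backward: applying the discount factor to the compounded value returns the original principal.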







EXERCISE 10.2 
1. Use the infinite-series form of $e^x$ in (10.6) to find the approximate value of: 
(@e (b) fe (=e) 
(Round off your calculation of each term to three decimal places, and continue with the 
series till you get a term 0.000.) 


2. Given the function $\phi(x) = e^{2x}$: 
(a) Write the polynomial part $P_n$ of its Maclaurin series. 
(b) Write the Lagrange form of the remainder $R_n$. Determine whether $R_n \to 0$ as 
$n \to \infty$, that is, whether the series is convergent to $\phi(x)$. 
(c) If convergent, so that $\phi(x)$ may be expressed as an infinite series, write out this 
series. 
3. Write an exponential expression for the value: 
(a) $70, compounded continuously at the interest rate of 4% for 3 years 
(b) $690, compounded continuously at the interest rate of 5% for 2 years 
(These interest rates are nominal rates per annum.) 
4. What is the instantaneous rate of growth of y in each of the following? 
(y= OO” © ya Aet 
(b) y= 15008 (d):y = 0.03et 
5. Show that the two functions $y_1 = Ae^{rt}$ (interest compounding) and $y_2 = Ae^{-rt}$ 
(discounting) are mirror images of each other with reference to the y axis (cf. Exercise 
10.1-5, part (b)). 


10.3 Logarithms 


Exponential functions are closely related to logarithmic functions (log functions, for short). 
Before we can discuss log functions, we must first understand the meaning of the term 
logarithm. 





The Meaning of Logarithm 
When we have two numbers such as 4 and 16, which can be related to each other by the 
equation $4^2 = 16$, we define the exponent 2 to be the logarithm of 16 to the base of 4, and 
write 

$$\log_4 16 = 2$$

It should be clear from this example that the logarithm is nothing but the power to which a 
base (4) must be raised to attain a particular number (16). In general, we may state that 


$$y = b^t \iff t = \log_b y \tag{10.12}$$


which indicates that the log of $y$ to the base $b$ (denoted by $\log_b y$) is the power to which the 
base $b$ must be raised in order to attain the value $y$. For this reason, it is correct, though 
tautological, to write 

$$b^{\log_b y} = y$$




Given $y$, the process of finding its logarithm $\log_b y$ is referred to as taking the log of $y$ to the 
base $b$. The reverse process, that of finding $y$ from a known value of its logarithm $\log_b y$, is 
referred to as taking the antilog of $\log_b y$. 

In the discussion of exponential functions, we emphasized that the function $y = b^t$ 
(with $b > 1$) is strictly increasing. This means that, for any positive value of $y$, there is a 
unique exponent $t$ (not necessarily positive) such that $y = b^t$; moreover, the larger the value 
of $y$, the larger must be $t$, as can be seen from Fig. 10.2. Translated into logarithms, the strict 
monotonicity of the exponential function implies that any positive number $y$ must possess 
a unique logarithm $t$ to a base $b > 1$ such that the larger the $y$, the larger its logarithm. 
As Figs. 10.1 and 10.2 show, $y$ is necessarily positive in the exponential function $y = b^t$; 
consequently, a negative number or zero cannot possess a logarithm. 
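The definitions of log and antilog in (10.12) can be tried out in a few lines. This is only a sketch: the helper names `log_base` and `antilog` are hypothetical, and the explicit checks on $y$ and $b$ mirror the restrictions stated in the text rather than any library requirement.

```python
import math

def log_base(y, b):
    """Return t such that b**t = y, for y > 0 and base b > 1."""
    if y <= 0:
        raise ValueError("only positive numbers possess logarithms")
    if b <= 1:
        raise ValueError("the base is taken to be greater than 1")
    return math.log(y) / math.log(b)

def antilog(t, b):
    """Reverse process: recover y = b**t from its logarithm t."""
    return b ** t

if __name__ == "__main__":
    print(log_base(16, 4))                  # the exponent linking 4 and 16
    print(antilog(log_base(16, 4), 4))      # back to 16 (up to rounding)
    print(log_base(2, 10) < log_base(3, 10))  # larger y, larger logarithm
```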


Common Log and Natural Log 
The base of the logarithm, $b > 1$, does not have to be restricted to any particular number, 
but in actual log applications two numbers are widely chosen as bases: the number 10 and 
the number $e$. When 10 is the base, the logarithm is known as the common logarithm, 
symbolized by $\log_{10}$ (or if the context is clear, simply by log). With $e$ as the base, on the other 
hand, the logarithm is referred to as the natural logarithm and is denoted either by $\log_e$ or 
by ln (for natural log). We may also use the symbol log (without subscript $e$) if it is not 
ambiguous in the particular context. 
Common logarithms, used frequently in computational work, are exemplified by the 
following: 

$\log_{10} 1{,}000 = 3$   [because $10^3 = 1{,}000$] 
$\log_{10} 100 = 2$   [because $10^2 = 100$] 
$\log_{10} 10 = 1$   [because $10^1 = 10$] 
$\log_{10} 1 = 0$   [because $10^0 = 1$] 
$\log_{10} 0.1 = -1$   [because $10^{-1} = 0.1$] 
$\log_{10} 0.01 = -2$   [because $10^{-2} = 0.01$] 

Observe the close relation between the set of numbers immediately to the left of the equals 
signs and the set of numbers immediately to the right. From these, it should be apparent that 
the common logarithm of a number between 10 and 100 must be between 1 and 2 and that 
the common logarithm of a number between 1 and 10 must be a positive fraction, etc. The 
exact logarithms can easily be obtained from a table of common logarithms or electronic 
calculators with log capabilities.† 

In analytical work, however, natural logarithms prove vastly more convenient to use 
than common logarithms. Since, by the definition of logarithm, we have the relationship 

$$y = e^t \iff t = \log_e y \quad (\text{or } t = \ln y) \tag{10.13}$$

it is easy to see that the analytical convenience of $e$ in exponential functions will automatically 
extend into the realm of logarithms with $e$ as the base. 


† More fundamentally, the value of a logarithm, like the value of e, can be calculated (or 
approximated) by resorting to a Maclaurin expansion of a log function, in a manner similar to that 
outlined in (10.6). However, we shall not venture into this derivation here. 




The following examples will serve to illustrate natural logarithms: 

$\ln e^3 = \log_e e^3 = 3$ 
$\ln e^2 = \log_e e^2 = 2$ 
$\ln e^1 = \log_e e^1 = 1$ 
$\ln 1 = \log_e e^0 = 0$ 
$\ln \dfrac{1}{e} = \log_e e^{-1} = -1$ 





The general principle emerging from these examples is that, given an expression $e^k$, where 
$k$ is any real number, we can automatically read the exponent $k$ as the natural log of $e^k$. In 
general, therefore, we have the result that $\ln e^k = k$.† 

The common log and natural log are convertible into each other, i.e., the base of a logarithm 
can be changed, just as the base of an exponential expression can. A pair of conversion 
formulas will be developed after we have studied the basic rules of logarithms. 


Rules of Logarithms 

Logarithms are in the nature of exponents; therefore, they obey certain rules closely related 
to the rules of exponents introduced in Sec. 2.5. These can be of great help in simplifying 
mathematical operations. The first three rules are stated only in terms of the natural log, but 
they are also valid when the symbol ln is replaced by $\log_b$. 


Rule I (log of a product)   $\ln(uv) = \ln u + \ln v$   $(u, v > 0)$ 

Example 1   $\ln(e^6 e^4) = \ln e^6 + \ln e^4 = 6 + 4 = 10$ 
Example 2   $\ln(Ae^7) = \ln A + \ln e^7 = \ln A + 7$ 


PROOF By definition, $\ln u$ is the power to which $e$ must be raised to attain the value of $u$; 
thus $e^{\ln u} = u$.‡ Similarly, we have $e^{\ln v} = v$ and $e^{\ln(uv)} = uv$. The latter is an exponential 
expression for $uv$. However, another expression of $uv$ is obtainable by direct multiplication 
of $u$ and $v$: 

$$uv = e^{\ln u} e^{\ln v} = e^{\ln u + \ln v}$$

Thus, by equating the two expressions for $uv$ above, we find 

$$e^{\ln(uv)} = e^{\ln u + \ln v} \quad \text{and hence} \quad \ln(uv) = \ln u + \ln v$$

Rule II (log of a quotient)   $\ln(u/v) = \ln u - \ln v$   $(u, v > 0)$ 

† As a mnemonic device, observe that when the symbol ln (or $\log_e$) is placed at the left of the 
expression $e^k$, the symbol ln seems to cancel out the symbol $e$, leaving $k$ as the answer. 

‡ Note that when $e$ is raised to the power $\ln u$, the symbol $e$ and the symbol ln again seem to cancel 
out, leaving $u$ as the answer. 




Example 3   $\ln(e^2/c) = \ln e^2 - \ln c = 2 - \ln c$ 
Example 4   $\ln(e^2/e^5) = \ln e^2 - \ln e^5 = 2 - 5 = -3$ 

The proof of this rule is very similar to that of Rule I and is therefore left to you as an 
exercise. 


Rule III (log of a power)   $\ln u^a = a \ln u$   $(u > 0)$ 

Example 5   $\ln e^{15} = 15 \ln e = 15$ 
Example 6   $\ln A^3 = 3 \ln A$ 

PROOF By definition, $e^{\ln u} = u$; and similarly, $e^{\ln u^a} = u^a$. However, another expression for 
$u^a$ can be formed as follows: 

$$u^a = (e^{\ln u})^a = e^{a \ln u}$$

By equating the exponents in the two expressions for $u^a$, we obtain the desired result, 
$\ln u^a = a \ln u$. 


These three rules are useful devices for simplifying the mathematical operations in certain 
types of problems. Rule I serves to convert, via logarithms, a multiplicative operation ($uv$) 
into an additive one ($\ln u + \ln v$); Rule II turns a division ($u/v$) into a subtraction ($\ln u - \ln v$); 
and Rule III enables us to reduce a power to a multiplicative constant. Moreover, these rules 
can be used in combination. Also, they can be read backward, and applied in reverse. 

Example 7   $\ln(uv^a) = \ln u + \ln v^a = \ln u + a \ln v$ 
Example 8   $\ln u + a \ln v = \ln u + \ln v^a = \ln(uv^a)$   [Example 7 in reverse] 

You are warned, however, that when we have additive expressions to begin with, logarithms 
may be of no help at all. In particular, it should be remembered that 

$$\ln(u \pm v) \neq \ln u \pm \ln v$$
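Rules I through III, together with the warning about additive expressions, can be spot-checked numerically. The sketch below uses arbitrary positive sample values for $u$, $v$, and $a$; these values and the variable names are assumptions chosen only for illustration, and a numerical check is of course no substitute for the proofs.

```python
import math

# Arbitrary positive sample values (illustrative assumptions).
u, v, a = 3.0, 7.0, 2.5

# Rule I: log of a product.
rule_1_holds = math.isclose(math.log(u * v), math.log(u) + math.log(v))
# Rule II: log of a quotient.
rule_2_holds = math.isclose(math.log(u / v), math.log(u) - math.log(v))
# Rule III: log of a power.
rule_3_holds = math.isclose(math.log(u ** a), a * math.log(u))
# Warning: the log of a sum is NOT the sum of the logs.
sum_rule_fails = not math.isclose(math.log(u + v), math.log(u) + math.log(v))

if __name__ == "__main__":
    print(rule_1_holds, rule_2_holds, rule_3_holds, sum_rule_fails)
```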


Let us now introduce two additional rules concerned with changes in the base of a 
logarithm. 


Rule IV (conversion of log base)   $\log_b u = (\log_b e)(\log_e u)$   $(u > 0)$ 

This rule, which resembles the chain rule in spirit (witness the "chain" $b \to e \to u$), 
enables us to derive a logarithm $\log_e u$ (to base $e$) from the logarithm $\log_b u$ (to base $b$), or 
vice versa. 

PROOF Let $u = e^p$, so that $p = \log_e u$. Then it follows that 

$$\log_b u = \log_b e^p = p \log_b e = (\log_e u)(\log_b e)$$

Rule IV can readily be generalized to 

$$\log_b u = (\log_b c)(\log_c u)$$

where $c$ is some base other than $b$. 




Rule V (inversion of log base)   $\log_b e = \dfrac{1}{\log_e b}$ 

This rule, which resembles the inverse-function rule of differentiation, enables us to 
obtain the log of $b$ to the base $e$ immediately upon being given the log of $e$ to the base $b$, 
and vice versa. (This rule can also be generalized to the form $\log_b c = 1/\log_c b$.) 

PROOF As an application of Rule IV, let $u = b$; then we have 

$$\log_b b = (\log_b e)(\log_e b)$$

But the left-side expression is $\log_b b = 1$; therefore $\log_b e$ and $\log_e b$ must be reciprocal to 
each other, as Rule V asserts. 


From the last two rules, it is easy to derive the following pair of conversion formulas 
between common log and natural log: 

$$\log_{10} N = (\log_{10} e)(\log_e N) = 0.4343 \log_e N$$
$$\log_e N = (\log_e 10)(\log_{10} N) = 2.3026 \log_{10} N \tag{10.14}$$

for $N$ a positive real number. The first equals sign in each formula is easily justified by 
Rule IV. In the first formula, the value 0.4343 (the common log of 2.71828) can be found 
from a table of common logarithms or an electronic calculator; in the second, the value 
2.3026 (the natural log of 10) is merely the reciprocal of 0.4343, so calculated because of 
Rule V. 

Example 9   $\log_e 100 = 2.3026(\log_{10} 100) = 2.3026(2) = 4.6052$. Conversely, we have 
$\log_{10} 100 = 0.4343(\log_e 100) = 0.4343(4.6052) = 2$. 
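The two conversion constants in (10.14) and the reciprocal relation of Rule V can be reproduced directly. In this sketch the constant and function names are hypothetical labels; `math.log10` and `math.log` are the standard-library common and natural log functions.

```python
import math

LOG10_OF_E = math.log10(math.e)   # approximately 0.4343
LN_OF_10 = math.log(10.0)         # approximately 2.3026

def common_from_natural(ln_N):
    """First line of (10.14): log10 N = (log10 e)(ln N)."""
    return LOG10_OF_E * ln_N

def natural_from_common(log10_N):
    """Second line of (10.14): ln N = (ln 10)(log10 N)."""
    return LN_OF_10 * log10_N

if __name__ == "__main__":
    print(round(LOG10_OF_E, 4), round(LN_OF_10, 4))
    print(LOG10_OF_E * LN_OF_10)            # Rule V: the product is 1
    print(natural_from_common(2.0))         # ln 100
    print(common_from_natural(math.log(100.0)))  # back to log10 100 = 2
```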


An Application 
The preceding rules of logarithms enable us to solve with ease certain simple exponential 
equations (exponential functions set equal to zero). For instance, if we seek to find the value 
of $x$ that satisfies the equation 

$$ab^x - c = 0 \qquad (a, b, c > 0)$$

we can first try to transform this exponential equation, by the use of logarithms, into a 
linear equation and then solve it as such. For this purpose, the $c$ term should first be transposed 
to the right side: 

$$ab^x = c$$

This is because there is no simple log expression for the additive expression $(ab^x - c)$, but 
there do exist convenient log expressions for the multiplicative term $ab^x$ and for $c$ individually. 
Thus, after the transposition of $c$ and upon taking the log (say, to base 10) of both 
sides, we have 

$$\log a + x \log b = \log c$$

which is a linear equation in the variable $x$, with the solution 

$$x = \frac{\log c - \log a}{\log b}$$
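The solution formula translates into a one-line function. This is a sketch under stated assumptions: the name `solve_exponential` is hypothetical, the sample equation $2 \cdot 3^x = 54$ is made up for the check, and the extra requirement $b \neq 1$ is added here because $\log b$ appears in the denominator.

```python
import math

def solve_exponential(a, b, c):
    """Solve a*b**x = c for x via x = (log c - log a) / log b.

    Assumes a, b, c > 0 and, additionally, b != 1 so that log b is nonzero.
    """
    if a <= 0 or b <= 0 or c <= 0:
        raise ValueError("a, b, and c must all be positive")
    if b == 1:
        raise ValueError("b = 1 makes log b zero")
    return (math.log10(c) - math.log10(a)) / math.log10(b)

if __name__ == "__main__":
    x = solve_exponential(2.0, 3.0, 54.0)   # 2 * 3**x = 54
    print(x, 2.0 * 3.0 ** x)                 # x is close to 3
```

Any base could be used in place of 10, since the base cancels in the ratio.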







EXERCISE 10.3 


1. What are the values of the following logarithms? 


(a) log; 10,000 (9 log 81 

(b) log), 0.0001 (d) log, 3,125 
2. Evaluate the following: 

(a) Ine” (©) Int je?) (@ (23)! 

(b) fog, e~4 (a) log,(1 /e?) (Ff) Ine* — 
3. Evaluate the following by application of the rules of logarithms: 

(a) 1og;9(100)"3 (O InG/8) (2 InaBe* 

(6) loge ih (@) In Ae? (A) (log, e)(log, 64) 
4. Which of the following are valid? 

(@nu-2=In5 iC] Ing tiny iow = In 

() 3+inv= In (d)in3 +In5 =In8 


5. Prove that $\ln(u/v) = \ln u - \ln v$. 


10.4 Logarithmic Functions 





When a variable is expressed as a function of the logarithm of another variable, the function 
is referred to as a logarithmic function. We have already seen two versions of this type 
of function in (10.12) and (10.13), namely, 

$$t = \log_b y \quad \text{and} \quad t = \log_e y \ (= \ln y)$$

which differ from each other only in regard to the base of the logarithm. 


Log Functions and Exponential Functions 

As we stated earlier, log functions are inverse functions of certain exponential functions. 
An examination of the previous two log functions will confirm that they are indeed the 
respective inverse functions of the exponential functions 

$$y = b^t \quad \text{and} \quad y = e^t$$

because the log functions cited are the results of reversing the roles of the dependent and 
independent variables of the corresponding exponential functions. You should realize, of 
course, that the symbol $t$ is being used here as a general symbol, and it does not necessarily 
stand for time. Even when it does, its appearance as a dependent variable does not mean 
that time is determined by some variable $y$; it means only that a given value of $y$ is associated 
with a unique point of time. 

As inverse functions of strictly increasing (exponential) functions, logarithmic functions 
must also be strictly increasing, which is consistent with our earlier statement that the 
larger a number, the larger is its logarithm to any given base. This property may be 
expressed symbolically in terms of the following two propositions: For two positive values 
of $y$ ($y_1$ and $y_2$), 

$$\ln y_1 = \ln y_2 \iff y_1 = y_2$$
$$\ln y_1 > \ln y_2 \iff y_1 > y_2 \tag{10.15}$$

These propositions are also valid, of course, if we replace ln by $\log_b$. 

FIGURE 10.3 (a) The exponential function $y = e^t$. (b) The log function $t = \log_e y \ (= \ln y)$. 


The Graphical Form 

The monotonicity and other general properties of logarithmic functions can be clearly 
observed from their graphs. Given the graph of the exponential function $y = e^t$, we can 
obtain the graph of the corresponding log function by replotting the original graph with the 
two axes transposed. The result of such replotting is illustrated in Fig. 10.3. Note that if the 
graph of Fig. 10.3b were laid over the graph of Fig. 10.3a, with $y$ axis on $y$ axis and $t$ axis 
on $t$ axis, the two curves should coincide exactly. As they actually appear in Fig. 10.3, with 
interchanged axes, on the other hand, the two curves are seen to be mirror images of each 
other (as the graphs of any pair of inverse functions must be) with reference to the 45° line 
drawn through the origin. 

This mirror-image relationship has several noteworthy implications. For one, although 
both are strictly increasing, the log curve increases at a decreasing rate (second 
derivative negative), in contradistinction to the exponential curve, which increases 
at an increasing rate. Another interesting contrast is that, while the exponential function 
has a positive range, the log function has a positive domain instead. (This latter restriction 
on the domain of the log function is, of course, merely another way of stating that only 
positive numbers possess logarithms.) A third consequence of the mirror-image relationship 
is that, just as $y = e^t$ has a vertical intercept at 1, the log function $t = \log_e y$ must 
cross the horizontal axis at $y = 1$, indicating that $\log_e 1 = 0$. Inasmuch as this horizontal 
intercept is unaffected by the base of the logarithm (for instance, $\log_{10} 1 = 0$, too), we 
may infer from the general shape of the log curve in Fig. 10.3b that, for any base, 







$$\begin{aligned} 0 < y < 1 &\iff \log y < 0 \\ y = 1 &\iff \log y = 0 \\ y > 1 &\iff \log y > 0 \end{aligned} \tag{10.16}$$


For verification, we can check the two sets of examples of common and natural logarithms 
given in Sec. 10.3. Furthermore, we may note that 


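$$\log y \to \infty \ \text{ as } \ y \to \infty \qquad \text{and} \qquad \log y \to -\infty \ \text{ as } \ y \to 0^+ \tag{10.16$'$}$$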


The graphical comparison of the logarithmic function and the exponential function in 
Fig. 10.3 is based on the simple functions $y = e^t$ and $t = \ln y$. The same general result will 
prevail if we compare the generalized exponential function $y = Ae^{rt}$ with its corresponding 
log function. With the (positive) constants $A$ and $r$ to compress or extend the exponential 
curve, it will nevertheless resemble the general shape of Fig. 10.3a, except that its vertical 
intercept will be at $y = A$ rather than at $y = 1$ (when $t = 0$, we have $y = Ae^0 = A$). 
Its inverse function, accordingly, must have a horizontal intercept at $y = A$. In general, 
with reference to the 45° line, the corresponding log curve will be a mirror image of the 
exponential curve. 

If the specific algebraic expression of the inverse of $y = Ae^{rt}$ is desired, it can be 
obtained by taking the natural log of both sides of this exponential function [which, 
according to the first proposition in (10.15), will leave the equation undisturbed] and then 
solving for $t$: 

$$\ln y = \ln(Ae^{rt}) = \ln A + rt \ln e = \ln A + rt$$

Hence, 

$$t = \frac{\ln y - \ln A}{r} \qquad (r \neq 0) \tag{10.17}$$

This result, a log function, constitutes the inverse of the exponential function $y = Ae^{rt}$. As 
claimed earlier, the function in (10.17) has a horizontal intercept at $y = A$, because when 
$y = A$, we have $\ln y = \ln A$, and therefore $t = 0$. 
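A short round-trip check makes the inverse relationship in (10.17) concrete. The function names and the sample values ($A = 50$, $r = 0.03$) are illustrative assumptions.

```python
import math

def exponential(A, r, t):
    """The generalized exponential function y = A*e**(r*t)."""
    return A * math.exp(r * t)

def inverse_exponential(A, r, y):
    """The log function (10.17): t = (ln y - ln A) / r, with r != 0."""
    if r == 0:
        raise ValueError("r must be nonzero for the inverse to exist")
    return (math.log(y) - math.log(A)) / r

if __name__ == "__main__":
    A, r = 50.0, 0.03   # hypothetical constants
    y = exponential(A, r, 12.0)
    print(inverse_exponential(A, r, y))   # recovers t = 12 (up to rounding)
    print(inverse_exponential(A, r, A))   # horizontal intercept: t = 0 at y = A
```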


Base Conversion 

In Sec. 10.2, it was stated that the exponential function $y = Ab^t$ can always be converted 
into a natural exponential function $y = Ae^{rt}$. We are now ready to derive a conversion 
formula. Instead of $Ab^t$, however, let us consider the conversion of the more general 
expression $Ab^{ct}$ into $Ae^{rt}$. Since the essence of the problem is to find an $r$ from given values of $b$ 
and $c$ such that 

$$e^r = b^c$$

all that is necessary is to express $r$ as a function of $b$ and $c$. Such a task is easily accomplished 
by taking the natural log of both sides of the last equation: 

$$\ln e^r = \ln b^c$$

The left side can immediately be read as equal to $r$, so that the desired function (conversion 
formula) emerges as 

$$r = \ln b^c = c \ln b \tag{10.18}$$

This indicates that the function $y = Ab^{ct}$ can always be rewritten in the natural-base form, 
$y = Ae^{(c \ln b)t}$. 


Example 1   Convert $y = 2^t$ to a natural exponential function. Here, we have $A = 1$, $b = 2$, and $c = 1$. 
Hence $r = c \ln b = \ln 2$, and the desired exponential function is 

$$y = Ae^{rt} = e^{(\ln 2)t}$$

If we like, we can also calculate the numerical value of ln 2 by use of (10.14) and a table of 
common logarithms as follows: 

$$\ln 2 = 2.3026 \log_{10} 2 = 2.3026(0.3010) = 0.6931 \tag{10.19}$$

Then we may express the earlier result alternatively as $y = e^{0.6931t}$. 

Example 2   Convert $y = 3(5)^{2t}$ to a natural exponential function. In this example, $A = 3$, $b = 5$, and 
$c = 2$, and formula (10.18) gives us $r = 2 \ln 5$. Therefore the desired function is 

$$y = Ae^{rt} = 3e^{(2 \ln 5)t}$$

Again, if we like, we can calculate that 

$$2 \ln 5 = \ln 25 = 2.3026 \log_{10} 25 = 2.3026(1.3979) = 3.2188$$

so the earlier result can be alternatively expressed as $y = 3e^{3.2188t}$. 
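The conversion formula (10.18) is easy to verify by machine. In the sketch below the function names are hypothetical; the inputs are those of the two conversions just worked out. A direct computation of $2 \ln 5$ may differ from the four-digit table value in the last decimal place, which is only a rounding effect.

```python
import math

def natural_rate(b, c):
    """Conversion formula (10.18): r = c * ln b."""
    return c * math.log(b)

def base_b_form(A, b, c, t):
    """The original expression A*b**(c*t)."""
    return A * b ** (c * t)

def natural_form(A, b, c, t):
    """The converted expression A*e**(r*t), with r from (10.18)."""
    return A * math.exp(natural_rate(b, c) * t)

if __name__ == "__main__":
    print(natural_rate(2.0, 1.0))   # ln 2
    print(natural_rate(5.0, 2.0))   # 2 ln 5
    for t in (0.0, 0.5, 1.0, 2.0):
        # Both forms give the same value at every t, integer or not.
        print(t, base_b_form(3.0, 5.0, 2.0, t), natural_form(3.0, 5.0, 2.0, t))
```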


It is also possible, of course, to convert log functions of the form $t = \log_b y$ into equivalent 
natural log functions. To that end, it is sufficient to apply Rule IV of logarithms, which 
may be expressed as 

$$\log_b y = (\log_b e)(\log_e y)$$

The direct substitution of this result into the given log function immediately gives us the 
desired natural log function: 

$$t = \log_b y = (\log_b e)(\log_e y) = \frac{1}{\log_e b} \log_e y \quad \text{[by Rule V of logarithms]} \quad = \frac{\ln y}{\ln b}$$

By the same procedure, we can transform the more general log function $t = a \log_b(cy)$ into 
the equivalent form 

$$t = a(\log_b e)(\log_e cy) = \frac{a}{\log_e b} \log_e(cy) = \frac{a}{\ln b} \ln(cy)$$




Example 3   Convert the function $t = \log_2 y$ into the natural log form. Since in this example we have 
$b = 2$ and $a = c = 1$, the desired function is 

$$t = \frac{1}{\ln 2} \ln y$$

By (10.19), however, we may also express it as $t = (1/0.6931) \ln y$. 


Example 4   Convert the function $t = 7 \log_{10}(2y)$ into a natural logarithmic function. The values of the 
constants are in this case $a = 7$, $b = 10$, and $c = 2$; consequently, the desired function is 

$$t = \frac{7}{\ln 10} \ln(2y)$$

But since $\ln 10 = 2.3026$, as (10.14) indicates, this function can be rewritten as 
$t = (7/2.3026) \ln(2y) = 3.0400 \ln(2y)$. 
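The conversion of $t = a \log_b(cy)$ into natural-log form can likewise be checked numerically. This sketch uses hypothetical function names and an arbitrary sample value of $y$; the two-argument form `math.log(x, base)` from the standard library supplies the base-$b$ logarithm.

```python
import math

def log_form_base_b(a, b, c, y):
    """The original function t = a * log_b(c*y)."""
    return a * math.log(c * y, b)

def log_form_natural(a, b, c, y):
    """The converted function t = (a / ln b) * ln(c*y)."""
    return (a / math.log(b)) * math.log(c * y)

if __name__ == "__main__":
    y = 3.5   # arbitrary positive sample value
    print(log_form_base_b(1.0, 2.0, 1.0, y), log_form_natural(1.0, 2.0, 1.0, y))
    print(log_form_base_b(7.0, 10.0, 2.0, y), log_form_natural(7.0, 10.0, 2.0, y))
    print(7.0 / math.log(10.0))   # the multiplicative constant, about 3.04
```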


In the preceding discussion, we have followed the practice of expressing $t$ as a function 
of $y$ when the function is logarithmic. The only reason for doing so is our desire to stress 
the inverse-function relationship between the exponential and logarithmic functions. When 
a log function is studied by itself, we shall write $y = \ln t$ (rather than $t = \ln y$), as is 
customary. Naturally, nothing in the analytical aspect of the discussion will be affected 
by such an interchange of symbols. 








EXERCISE 10.4 


1. The form of the inverse function of $y = Ae^{rt}$ in (10.17) requires $r$ to be nonzero. What 
is the meaning of this requirement when viewed in reference to the original exponential 
function $y = Ae^{rt}$? 

2. (a) Sketch a graph of the exponential function $y = Ae^{rt}$; indicate the value of the 
vertical intercept. 
(b) Then sketch the graph of the log function $t = \dfrac{\ln y - \ln A}{r}$, and indicate the value of 
the horizontal intercept. 

3. Find the inverse function of y = ab, 
4. Transform the following functions to their natural exponential forms: 


(a) y= 8 (0) y= 505 
(b) y= 27" (d) y= 20s)" 
5. Transform the following functions to their natural logarithmic forms: 
(@) t=log,y (Q t= Flog sy) 
(b) t= logs(3y) (A) t= 2hogig y 


6. Find the continuous-compounding nominal interest rate per annum ($r$) that is equivalent 
to a discrete-compounding interest rate ($i$) of 
(a) 5 percent per annum, compounded annually. 
(b) 5 percent per annum, compounded semiannually. 
(c) 6 percent per annum, compounded semiannually. 
(d) 6 percent per annum, compounded quarterly. 




7. (a) In describing Fig. 10.3, the text states that, if the two curves are laid over each 
other, they show a mirror-image relationship. Where is the "mirror" located? 

(b) If we plot a function $f(x)$ and its negative, $-f(x)$, in the same diagram, will the two 
curves display a mirror-image relationship, too? If so, where is the "mirror" located 
in this case? 

(c) If we plot the graphs of $Ae^{rt}$ and $Ae^{-rt}$ in the same diagram, will the two curves be 
mirror images of each other? If so, where is the "mirror" located? 


10.5 Derivatives of Exponential and Logarithmic Functions 





Earlier it was claimed that the function $e^t$ is its own derivative. As it turns out, the natural 
log function, $\ln t$, possesses a rather convenient derivative also, namely, $d(\ln t)/dt = 1/t$. 
This fact reinforces our preference for the base $e$. Let us now prove the validity of these two 
derivative formulas, and then we shall deduce the derivative formulas for certain variants 
of the exponential and log expressions $e^t$ and $\ln t$. 





Log-Function Rule 
The derivative of the log function $y = \ln t$ is 

$$\frac{d}{dt} \ln t = \frac{1}{t}$$

To prove this, we recall that, by definition, the derivative of $y = \phi(t) = \ln t$ has the 
following value at $t = N$ (assuming $t \to N^+$): 

$$\phi'(N) = \lim_{t \to N^+} \frac{\phi(t) - \phi(N)}{t - N} = \lim_{t \to N^+} \frac{\ln t - \ln N}{t - N} = \lim_{t \to N^+} \frac{\ln(t/N)}{t - N} \quad \text{[by Rule II of logarithms]}$$

Now let us introduce a shorthand symbol $m \equiv \dfrac{N}{t - N}$. Then we can write $\dfrac{1}{t - N} = \dfrac{m}{N}$, and 
also $\dfrac{t}{N} = 1 + \dfrac{t - N}{N} = 1 + \dfrac{1}{m}$. Thus the expression to the right of the limit sign in the 
previous equation can be converted to the form 

$$\frac{1}{t - N} \ln \frac{t}{N} = \frac{m}{N} \ln\left(1 + \frac{1}{m}\right) = \frac{1}{N} \ln\left(1 + \frac{1}{m}\right)^m \quad \text{[by Rule III of logarithms]}$$

Note that, as $t \to N^+$, $m$ tends to infinity. Thus, to find the desired derivative value, we may 
take the limit of the last expression in the preceding equation as $m \to \infty$: 

$$\phi'(N) = \lim_{m \to \infty} \frac{1}{N} \ln\left(1 + \frac{1}{m}\right)^m = \frac{1}{N} \ln e = \frac{1}{N} \quad \text{[by the definition of } e \text{ as } \lim_{m \to \infty} (1 + 1/m)^m \text{]}$$

Since $N$ can be any number for which a logarithm is defined, however, we can generalize 
this result, and write $\phi'(t) = d(\ln t)/dt = 1/t$. This proves the log-function rule for 
$t \to N^+$. 




The case of $t \to N^-$ needs some modifications, but the essence of the proof is similar. 
Now the derivative of $y = \ln t$ has the value 

$$\phi'(N) = \lim_{t \to N^-} \frac{\phi(t) - \phi(N)}{t - N} = \lim_{t \to N^-} \frac{\phi(N) - \phi(t)}{N - t} = \lim_{t \to N^-} \frac{\ln N - \ln t}{N - t} = \lim_{t \to N^-} \frac{\ln(N/t)}{N - t}$$

Let $\mu \equiv t/(N - t)$; then $1/(N - t) = \mu/t$, and $N/t = 1 + (N - t)/t = 1 + 1/\mu$. These 
equations enable us to rewrite the expression to the right of the last limit sign in the 
preceding equation for $\phi'(N)$ as 

$$\frac{1}{N - t} \ln \frac{N}{t} = \frac{\mu}{t} \ln\left(1 + \frac{1}{\mu}\right) = \frac{1}{t} \ln\left(1 + \frac{1}{\mu}\right)^{\mu}$$

As $t \to N^-$, $\mu \to \infty$. Thus the desired derivative value is 

$$\phi'(N) = \lim_{t \to N^-} \frac{1}{t} \ln\left(1 + \frac{1}{\mu}\right)^{\mu} = \frac{1}{N} \ln e = \frac{1}{N}$$

the same result as for the case of $t \to N^+$. This completes the proof of the log-function 
rule. Notice, once more, that in the proof process, no specific numerical values are 
employed, and the result is therefore generally applicable. 
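The limiting argument in the proof can be watched numerically for the case $t \to N^+$. The sketch below is illustrative only: the function names are hypothetical, $N = 4$ is an arbitrary choice, and `math.log1p(x)` (the standard-library function for $\ln(1 + x)$) is used to keep the computation accurate when $1/m$ is tiny.

```python
import math

def one_sided_quotient(N, m):
    """Difference quotient (ln t - ln N)/(t - N) with t = N + N/m, i.e. m = N/(t - N)."""
    t = N + N / m
    return (math.log(t) - math.log(N)) / (t - N)

def rewritten_form(N, m):
    """The same quotient rewritten as (1/N) * ln[(1 + 1/m)**m] = (m/N) * ln(1 + 1/m)."""
    return (1.0 / N) * m * math.log1p(1.0 / m)

if __name__ == "__main__":
    N = 4.0   # arbitrary positive point at which the derivative is evaluated
    for m in (10.0, 1e2, 1e4, 1e6):
        # Both columns approach 1/N = 0.25 as m grows.
        print(m, one_sided_quotient(N, m), rewritten_form(N, m))
```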


Exponential-Function Rule 
The derivative of the function $y = e^t$ is 

$$\frac{d}{dt} e^t = e^t$$

This result follows easily from the log-function rule. We know that the inverse function of 
the function $y = e^t$ is $t = \ln y$, with derivative $dt/dy = 1/y$. Thus, by the inverse-function 
rule, we may write immediately 

$$\frac{d}{dt} e^t = \frac{dy}{dt} = \frac{1}{dt/dy} = \frac{1}{1/y} = y = e^t$$





The Rules Generalized 

The log-function and exponential-function rules can be generalized to cases where 
the variable $t$ in the expression $e^t$ and $\ln t$ is replaced by some function of $t$, say, $f(t)$. The 
generalized versions of the two rules are 

$$\frac{d}{dt} e^{f(t)} = f'(t) e^{f(t)} \qquad \left[\text{or } \frac{d}{dt} e^u = e^u \frac{du}{dt}\right]$$
$$\frac{d}{dt} \ln f(t) = \frac{f'(t)}{f(t)} \qquad \left[\text{or } \frac{d}{dt} \ln v = \frac{1}{v} \frac{dv}{dt}\right] \tag{10.20}$$

The proofs for (10.20) involve nothing more than the straightforward application of the 
chain rule. Given a function $y = e^{f(t)}$, we can first let $u = f(t)$, so that $y = e^u$. Then, by 
the chain rule, the derivative emerges as 

$$\frac{d}{dt} e^{f(t)} = \frac{d}{dt} e^u = \frac{d e^u}{du} \frac{du}{dt} = e^u \frac{du}{dt} = e^{f(t)} f'(t)$$



Similarly, given a function $y = \ln f(t)$, we can first let $v = f(t)$, so as to form a chain: 
$y = \ln v$, where $v = f(t)$. Then, by the chain rule, we have 

$$\frac{d}{dt} \ln f(t) = \frac{d}{dt} \ln v = \frac{d \ln v}{dv} \frac{dv}{dt} = \frac{1}{v} \frac{dv}{dt} = \frac{1}{f(t)} f'(t) = \frac{f'(t)}{f(t)}$$

Note that the only real modification introduced in (10.20) beyond the simpler rules 
$de^t/dt = e^t$ and $d(\ln t)/dt = 1/t$ is the multiplicative factor $f'(t)$. 
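The generalized rules in (10.20) can be compared against a numerical derivative. This is a sketch under stated assumptions: the inner function $f(t) = t^2 + 1$ is an arbitrary choice (positive everywhere, so that $\ln f(t)$ is defined), and the central-difference helper is a standard approximation device, not something taken from the text.

```python
import math

def central_difference(g, t, h=1e-6):
    """Approximate g'(t) by the symmetric quotient [g(t+h) - g(t-h)] / (2h)."""
    return (g(t + h) - g(t - h)) / (2.0 * h)

def f(t):
    """Hypothetical inner function, chosen to be positive for all t."""
    return t ** 2 + 1.0

def f_prime(t):
    return 2.0 * t

def d_exp_f(t):
    """First line of (10.20): d/dt e**f(t) = f'(t) * e**f(t)."""
    return f_prime(t) * math.exp(f(t))

def d_ln_f(t):
    """Second line of (10.20): d/dt ln f(t) = f'(t) / f(t)."""
    return f_prime(t) / f(t)

if __name__ == "__main__":
    t = 1.5
    print(d_exp_f(t), central_difference(lambda s: math.exp(f(s)), t))
    print(d_ln_f(t), central_difference(lambda s: math.log(f(s)), t))
```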


Example 1   Find the derivative of the function $y = e^{rt}$. Here, the exponent is $rt = f(t)$, with $f'(t) = r$; 
thus 

$$\frac{dy}{dt} = \frac{d}{dt} e^{rt} = re^{rt}$$

Example 2   Find $dy/dt$ from the function $y = e^{-t}$. In this case, $f(t) = -t$, so that $f'(t) = -1$. As a result, 

$$\frac{dy}{dt} = \frac{d}{dt} e^{-t} = -e^{-t}$$

Example 3   Find $dy/dt$ from the function $y = \ln(at)$. Since in this case $f(t) = at$, with $f'(t) = a$, the 
derivative is 

$$\frac{d}{dt} \ln(at) = \frac{a}{at} = \frac{1}{t}$$

which is, interestingly enough, identical with the derivative of $y = \ln t$. 
This example illustrates the fact that a multiplicative constant for $t$ within a log expression 
drops out in the process of derivation. But note that, for a constant $k$, we have 

$$\frac{d}{dt} k \ln t = k \frac{d}{dt} \ln t = \frac{k}{t}$$

thus a multiplicative constant outside the log expression is still retained in derivation. 

Example 4   Find the derivative of the function $y = \ln t^c$. With $f(t) = t^c$ and $f'(t) = ct^{c-1}$, the formula in 
(10.20) yields 

$$\frac{d}{dt} \ln t^c = \frac{ct^{c-1}}{t^c} = \frac{c}{t}$$





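Example 5   Find $dy/dt$ from $y = t^3 \ln t^2$. Because this function is a product of two terms $t^3$ and $\ln t^2$, the 
product rule should be used: 

$$\begin{aligned} \frac{dy}{dt} &= t^3 \frac{d}{dt} \ln t^2 + (\ln t^2) \frac{d}{dt} t^3 \\ &= t^3 \left(\frac{2t}{t^2}\right) + (\ln t^2)(3t^2) \\ &= 2t^2 + 3t^2(2 \ln t) \qquad \text{[Rule III of logarithms]} \\ &= 2t^2(1 + 3 \ln t) \end{aligned}$$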
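The closed-form derivatives obtained for $\ln(at)$, $\ln t^c$, and $t^3 \ln t^2$ can be compared with numerical derivatives. In this sketch the function names, the evaluation point $t = 2$, and the constants $a = 5$ and $c = 3$ are all illustrative assumptions.

```python
import math

def central_difference(g, t, h=1e-6):
    """Approximate g'(t) by the symmetric quotient [g(t+h) - g(t-h)] / (2h)."""
    return (g(t + h) - g(t - h)) / (2.0 * h)

def d_ln_at(t):
    """d/dt ln(at) = 1/t: the constant a drops out."""
    return 1.0 / t

def d_ln_t_power(t, c):
    """d/dt ln(t**c) = c/t."""
    return c / t

def d_t3_ln_t2(t):
    """d/dt [t**3 * ln(t**2)] = 2*t**2 * (1 + 3*ln t)."""
    return 2.0 * t ** 2 * (1.0 + 3.0 * math.log(t))

if __name__ == "__main__":
    t, a, c = 2.0, 5.0, 3.0   # hypothetical sample values, with t > 0
    print(d_ln_at(t), central_difference(lambda s: math.log(a * s), t))
    print(d_ln_t_power(t, c), central_difference(lambda s: math.log(s ** c), t))
    print(d_t3_ln_t2(t), central_difference(lambda s: s ** 3 * math.log(s ** 2), t))
```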




The Case of Base b 


For exponential and log functions with base $b$, the derivatives are 

$$\frac{d}{dt} b^t = b^t \ln b$$
$$\frac{d}{dt} \log_b t = \frac{1}{t \ln b} \tag{10.21}$$

Note that in the special case of base $e$ (when $b = e$), we have $\ln b = \ln e = 1$, so that these 
two derivatives reduce to the basic exponential-function rule $(d/dt)e^t = e^t$ and the basic 
log-function rule $(d/dt) \ln t = 1/t$, respectively. 

The proofs for (10.21) are not difficult. For the case of $b^t$, the proof is based on the 
identity $b \equiv e^{\ln b}$, which enables us to write 

$$b^t = e^{(\ln b)t} = e^{t \ln b}$$

(We write $t \ln b$, instead of $\ln b^t$, in order to emphasize that $t$ is not a part of the log 
expression.) Hence 

$$\frac{d}{dt} b^t = \frac{d}{dt} e^{t \ln b} = (\ln b)(e^{t \ln b}) \quad \text{[by (10.20)]} \quad = (\ln b)(b^t) = b^t \ln b$$

To prove the second part of (10.21), on the other hand, we rely on the basic log property that 

$$\log_b t = (\log_b e)(\log_e t) = \frac{1}{\ln b} \ln t$$

which leads us to the derivative 

$$\frac{d}{dt} \log_b t = \frac{d}{dt}\left(\frac{1}{\ln b} \ln t\right) = \frac{1}{\ln b} \frac{d}{dt} \ln t = \frac{1}{\ln b}\left(\frac{1}{t}\right)$$

The more general versions of these two formulas are 

$$\frac{d}{dt} b^{f(t)} = f'(t) b^{f(t)} \ln b$$
$$\frac{d}{dt} \log_b f(t) = \frac{f'(t)}{f(t)} \frac{1}{\ln b} \tag{10.21$'$}$$

Again, it is seen that if $b = e$, then $\ln b = 1$, and these formulas reduce to (10.20). 


Example 6   Find the derivative of the function $y = 12^{1-t}$. Here, $b = 12$, $f(t) = 1 - t$, and $f'(t) = -1$; 
thus 

$$\frac{dy}{dt} = -(12)^{1-t} \ln 12$$

Higher Derivatives 
Higher derivatives of exponential and log functions, like those of other types of functions, 
are merely the results of repeated differentiation. 




Example 7   Find the second derivative of $y = b^t$ (with $b > 1$). The first derivative, by (10.21), is 
$y'(t) = b^t \ln b$ (where $\ln b$ is, of course, a constant); thus, by differentiating once more with 
respect to $t$, we have 

$$y''(t) = \frac{d}{dt} y'(t) = \left(\frac{d}{dt} b^t\right) \ln b = (b^t \ln b) \ln b = b^t (\ln b)^2$$

Note that $y = b^t$ is always positive and $\ln b$ (for $b > 1$) is also positive [by (10.16)]; thus 
$y'(t) = b^t \ln b$ must be positive. And $y''(t)$, being a product of $b^t$ and a squared number, is 
also positive. These facts confirm our previous assertion that the exponential function $y = b^t$ 
increases monotonically at an increasing rate. 


Example 8   Find the second derivative of $y = \ln t$. The first derivative is $y' = 1/t = t^{-1}$; hence, the 
second derivative is 

$$y'' = -t^{-2} = \frac{-1}{t^2}$$

Inasmuch as the domain of this function consists of the open interval $(0, \infty)$, $y' = 1/t$ must 
be a positive number. On the other hand, $y''$ is always negative. Together, these conclusions 
serve to confirm our earlier assertion that the log function $y = \ln t$ increases monotonically 
at a decreasing rate. 


An Application 

One of the prime virtues of the logarithm is its ability to convert a multiplication into an 
addition, and a division into a subtraction. This property can be exploited when we are 
differentiating a complicated product or quotient of any type of functions (not necessarily 
exponential or logarithmic). 


Find dy/dx from 
= x 
Y= E341) 


Instead of applying the product and quotient rules, we may first take the natural log of both 
sides of the equation to reduce the function to the form 


In y = Inx? = In(x + 3) -- In(2x +1) 
According to (10.20), the derivative of the left side with respect to xis 


d . lay 
~ 4 = 0 
am (left side) vox 
whereas the right side gives 
2x 1 2 7x+6 





Lan ide) = = 
ay MON Se) = 3 Deed EDGED 


When the two results are equated and both sides are multiplied by y, we get the desired 
derivative as follows: 


dy _ 7x+6 
dx x(Qe+3)Qx4+1)" 
7x+6 xe x(7x +6) 





* Xe 3Qx+ 1 e+ 3N2e +) + 3Qx 412 
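
The log-differentiation result for $y = x^2/[(x+3)(2x+1)]$ can be cross-checked numerically. The sketch below is an illustration rather than part of the original text; the function names and the sample points are assumptions. It evaluates the derivative three ways: as $y$ times the derivative of $\ln y$, by the simplified closed form, and by a central difference.

```python
def y(x):
    """y = x^2 / ((x + 3)(2x + 1))."""
    return x ** 2 / ((x + 3) * (2 * x + 1))

def dy_dx_via_logs(x):
    """dy/dx = y * d(ln y)/dx, with d(ln y)/dx = 2/x - 1/(x+3) - 2/(2x+1)."""
    dlny_dx = 2 / x - 1 / (x + 3) - 2 / (2 * x + 1)
    return y(x) * dlny_dx

def dy_dx_closed_form(x):
    """Simplified result: x(7x + 6) / ((x + 3)^2 (2x + 1)^2)."""
    return x * (7 * x + 6) / ((x + 3) ** 2 * (2 * x + 1) ** 2)

def dy_dx_numeric(x, h=1e-6):
    """Central-difference approximation, independent of the algebra above."""
    return (y(x + h) - y(x - h)) / (2 * h)

# Sample points with x > 0, so that every logarithm involved is defined.
for x in (0.5, 1.0, 4.0):
    print(x, dy_dx_via_logs(x), dy_dx_closed_form(x), dy_dx_numeric(x))
```

The three columns should agree to several decimal places at each sample point.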




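Example 10

Find $dy/dx$ from $y = x^a e^{kx-c}$. Taking the natural log of both sides, we have

$$\ln y = a \ln x + \ln e^{kx-c} = a \ln x + kx - c$$

Differentiating both sides with respect to $x$, and using (10.20), we then get

$$\frac{1}{y}\frac{dy}{dx} = \frac{a}{x} + k$$

and

$$\frac{dy}{dx} = \left(\frac{a}{x} + k\right) y = \left(\frac{a}{x} + k\right) x^a e^{kx-c}$$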


Note, however, that if the given function contains additive terms, then it may not be de- 
sirable to convert the function into the log form. 





EXERCISE 10.5 


1. Find the derivatives of: 


(y= es (dy =5e-" (g) y= x26 
(y= e Of y= eo? torre (hy y= axeb*t¢ 
y= (f) y = xe" 


2. (a) Verify the derivative in Example 3 by utilizing the equation $\ln(at) = \ln a + \ln t$.
(b) Verify the result in Example 4 by utilizing the equation $\ln t^c = c \ln t$.
3. Find the derivatives of: 





2x 
(@) y=Ini785) (y=smesi? — y=tn(=) 
(b) y = In(at) (e) y=Inx=In(1 +x) (A) y= 5x4 Inx? 
© y=In(t+19) () y =In[xtl — x)°] 
4. Find the derivatives of: 
(@) y=5! (@Q yatP? (© y= lag;(8x? + 3) 
(b) y = log,(t+1) (d) y = logy 7x? (f) y= x? log, x 


5. Prove the two formulas in (10.21').

6. Show that the function $V = Ae^{rt}$ (with $A, r > 0$) and the function $A = Ve^{-rt}$ (with
$V, r > 0$) are both strictly monotonic, but in opposite directions, and that they are both
strictly convex in shape (cf. Exercise 10.2-5).

7. Find the derivatives of the following by first taking the natural log of both sides: 


x a 
OY RET () y= G24 Hem 


10.6 Optimal Timing 





What we have learned about exponential and log functions can now be applied to some 
simple problems of optimal timing. 


A Problem of Wine Storage 
Suppose that a certain wine dealer is in possession of a particular quantity (say, a case) of
wine, which he can either sell at the present time ($t = 0$) for a sum of $\$K$ or else store for
some length of time and then sell at a higher value. The growing value ($V$) of the wine is
known to be the following function of time:

$$V = Ke^{\sqrt{t}} \qquad [= K \exp(t^{1/2})] \tag{10.22}$$

so that if $t = 0$ (sell now), then $V = K$. The problem is to ascertain when he should sell it
in order to maximize profit, assuming the storage cost to be nil.†

Since the cost of wine is a “sunk” cost (the wine is already paid for by the dealer) and
since storage cost is assumed to be nonexistent, to maximize profit is the same as maximizing
the sales revenue, or the value of $V$. There is one catch, however. Each value of $V$
corresponding to a specific point of $t$ represents a dollar sum receivable at a different date
and, because of the interest element involved, is not directly comparable with the $V$ value
of another date. The way out of this difficulty is to discount each $V$ figure to its
present-value equivalent (the value at time $t = 0$), for then all the $V$ values will be on a
comparable footing.

Let us assume that the interest rate on the continuous-compounding basis is at the level
of $r$. Then, according to (10.11), the present value of $V$ can be expressed as

$$A(t) = Ve^{-rt} = Ke^{\sqrt{t}}e^{-rt} = Ke^{\sqrt{t} - rt} \tag{10.22'}$$

where $A$, denoting the present value of $V$, is itself a function of $t$. Therefore our problem
amounts to finding the value of $t$ that maximizes $A$.





Maximization Conditions 
The first-order condition for maximizing $A$ is to have $dA/dt = 0$. To find this derivative,
we can either differentiate (10.22') directly with respect to $t$, or do it indirectly by first
taking the natural log of both sides of (10.22') and then differentiating with respect to $t$. Let
us illustrate the latter procedure.

First, we obtain from (10.22') the equation

$$\ln A(t) = \ln K + \ln e^{\sqrt{t} - rt} = \ln K + (t^{1/2} - rt)$$

Upon differentiating both sides with respect to $t$, we then get

$$\frac{1}{A}\frac{dA}{dt} = \frac{1}{2}t^{-1/2} - r$$

or

$$\frac{dA}{dt} = A\left(\frac{1}{2}t^{-1/2} - r\right)$$

Since $A \neq 0$, the condition $dA/dt = 0$ can be satisfied if and only if

$$\frac{1}{2}t^{-1/2} = r \qquad \text{or} \qquad \frac{1}{2\sqrt{t}} = r \qquad \text{or} \qquad \frac{1}{2r} = \sqrt{t}$$

This implies that the optimum length of storage time is

$$t^* = \left(\frac{1}{2r}\right)^2 = \frac{1}{4r^2}$$


† The consideration of storage cost will entail a difficulty we are not yet equipped to handle. Later, in
Chap. 14, we shall return to this problem.




FIGURE 10.4 


If $r = 0.10$, for instance, then $t^* = 25$, and the dealer should store the case of wine for
25 years. Note that the higher the rate of interest (rate of discount) is, the shorter the
optimum storage period will be.

The first-order condition, $1/(2\sqrt{t}) = r$, admits of an easy economic interpretation. The
left-hand expression merely represents the rate of growth of wine value $V$, because from
(10.22)

$$\begin{aligned}
\frac{dV}{dt} &= \frac{d}{dt} K \exp(t^{1/2}) = K \frac{d}{dt} \exp(t^{1/2}) && [K \text{ constant}] \\
&= K\left(\frac{1}{2}t^{-1/2}\right)\exp(t^{1/2}) && [\text{by } (10.20)] \\
&= \left(\frac{1}{2}t^{-1/2}\right)V && [\text{by } (10.22)]
\end{aligned}$$

so that the rate of growth of $V$ is indeed the left-hand expression in the first-order condition:

$$r_V \equiv \frac{dV/dt}{V} = \frac{1}{2}t^{-1/2} = \frac{1}{2\sqrt{t}}$$

The right-hand expression $r$ is, in contrast, the rate of interest or the rate of
compound-interest growth of the cash fund receivable if the wine is sold right away, which is an
opportunity-cost aspect of storing the wine. Thus, the equating of the two instantaneous rates, as
illustrated in Fig. 10.4, is an attempt to hold onto the wine until the advantage of storage
is completely wiped out, i.e., to wait till the moment when the (declining) rate of growth of
wine value is just matched by the (constant) interest rate on cash sales receipts.
[Figure 10.4 plot: vertical axis “Rate”; a declining curve $1/(2\sqrt{t})$ (rate of growth of wine value) and a horizontal line $r$ (rate of interest)]

The next order of business is to check whether the value of $t^*$ satisfies the second-order
condition for maximization of $A$. The second derivative of $A$ is

$$\frac{d^2A}{dt^2} = \frac{d}{dt}\left[A\left(\frac{1}{2}t^{-1/2} - r\right)\right] = A\frac{d}{dt}\left(\frac{1}{2}t^{-1/2} - r\right) + \left(\frac{1}{2}t^{-1/2} - r\right)\frac{dA}{dt}$$

But, since the final term drops out when we evaluate it at the equilibrium (optimum) point,
where $dA/dt = 0$, we are left with

$$\frac{d^2A}{dt^2} = A\frac{d}{dt}\left(\frac{1}{2}t^{-1/2} - r\right) = A\left(-\frac{1}{4}t^{-3/2}\right) = \frac{-A}{4\sqrt{t^3}}$$

In view that $A > 0$, this second derivative is negative when evaluated at $t^* > 0$, thereby
ensuring that the solution value $t^*$ is indeed profit-maximizing.
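
The wine-storage solution can be reproduced in a few lines. This is a sketch under stated assumptions rather than anything prescribed by the text: $K$ is normalized to 1 (it does not affect the optimal date), the function names are hypothetical, and the grid search is a deliberately crude independent check on the closed form $t^* = 1/(4r^2)$.

```python
import math

def present_value(t, r, K=1.0):
    """A(t) = K * exp(sqrt(t) - r*t), the present value of the wine."""
    return K * math.exp(math.sqrt(t) - r * t)

def optimal_storage_time(r):
    """First-order condition 1/(2*sqrt(t)) = r gives t* = 1/(4 r^2)."""
    return 1.0 / (4.0 * r ** 2)

def second_derivative_at_optimum(r, K=1.0):
    """d2A/dt2 evaluated at t*: -A / (4 * sqrt(t^3))."""
    t_star = optimal_storage_time(r)
    return -present_value(t_star, r, K) / (4.0 * math.sqrt(t_star ** 3))

r = 0.10
t_star = optimal_storage_time(r)

# Crude grid search over 0.01, 0.02, ..., 100 years as an independent check.
grid = [i / 100 for i in range(1, 10001)]
t_grid = max(grid, key=lambda t: present_value(t, r))

print(t_star, t_grid)                     # both should be (approximately) 25 years
print(second_derivative_at_optimum(r))    # negative, as the second-order condition requires
```

Raising `r` in the sketch shortens the optimal storage period, in line with the remark that a higher discount rate calls for earlier sale.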


A Problem of Timber Cutting 
A similar problem, which involves a choice of the best time to take action, is that of timber 
cutting. 

Suppose the value of timber (already planted on some given land) is the following
increasing function of time:

$$V = 2^{\sqrt{t}}$$

expressed in units of \$1,000. Assuming a discount rate of $r$ (on the continuous basis) and
also assuming zero upkeep cost during the period of timber growth, what is the optimal
time to cut the timber for sale?

As in the wine problem, we should first convert $V$ into its present value:

$$A(t) = Ve^{-rt} = 2^{\sqrt{t}}e^{-rt}$$

thus

$$\ln A = \ln 2^{\sqrt{t}} + \ln e^{-rt} = \sqrt{t}\,\ln 2 - rt = t^{1/2}\ln 2 - rt$$

To maximize $A$, we must set $dA/dt = 0$. The first derivative is obtainable by differentiating
$\ln A$ with respect to $t$ and then multiplying by $A$:

$$\frac{1}{A}\frac{dA}{dt} = \frac{1}{2}t^{-1/2}\ln 2 - r$$

thus

$$\frac{dA}{dt} = A\left(\frac{\ln 2}{2\sqrt{t}} - r\right)$$

Since $A \neq 0$, the condition $dA/dt = 0$ can be met if and only if

$$\frac{\ln 2}{2\sqrt{t}} = r \qquad \text{or} \qquad \sqrt{t} = \frac{\ln 2}{2r}$$

Consequently, the optimum number of years of growth is

$$t^* = \left(\frac{\ln 2}{2r}\right)^2$$

It is evident from this solution that, the higher the rate of discount, the earlier the timber
should be cut.

To make sure that $t^*$ is a maximizing (instead of minimizing) solution, the second-order
condition should be checked. But this will be left to you as an exercise.

In this example, we have abstracted from planting cost by assuming that the trees are
already planted, in which case the (sunk) planting cost is legitimately excludable from
consideration in the optimization decision. If the decision is not one of when to harvest but one
of whether or not to plant at all, then the planting cost (incurred at the present) must be duly
compared with the present value of the timber output, computed with $t$ set at the optimum
value $t^*$. For instance, if $r = 0.05$, then we have

$$t^* = \left(\frac{0.6931}{0.10}\right)^2 = (6.931)^2 = 48.0 \text{ years}$$

and

$$A^* = 2^{6.931}e^{-0.05(48.0)} = (122.0222)e^{-2.40} = 122.0222(0.0907) = \$11.0674 \text{ (in thousands)}$$

So only a planting cost lower than $A^*$ will make the venture worthwhile, again provided
that upkeep cost is nil.
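
The timber figures above can be recomputed directly from the formulas. The following sketch is illustrative only; the function name is an assumption. Because it carries full floating-point precision instead of the rounded intermediate values used in the text, the present value it reports (roughly 11.05) differs slightly from the hand-computed 11.0674.

```python
import math

def timber_optimum(r):
    """Return (t*, A*) for V = 2**sqrt(t) discounted at the continuous rate r."""
    t_star = (math.log(2) / (2 * r)) ** 2            # from ln 2 / (2*sqrt(t)) = r
    a_star = 2 ** math.sqrt(t_star) * math.exp(-r * t_star)
    return t_star, a_star

t_star, a_star = timber_optimum(0.05)
print(round(t_star, 1), round(a_star, 2))   # years of growth; present value in $1,000
```

A planting cost would then be compared with `a_star` (in thousands of dollars) to decide whether planting is worthwhile.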





EXERCISE 10.6 


1. If the value of wine grows according to the function $V = Ke^{2\sqrt{t}}$, instead of as in
(10.22), how long should the dealer store the wine?


2. Check the second-order condition for the timber-cutting problem. 


3. As a generalization of the optimization problem illustrated in the present section, show
that:
(a) With any value function $V = f(t)$ and a given continuous rate of discount $r$, the
first-order condition for the present value $A$ of $V$ to reach a maximum is that the
rate of growth of $V$ be equal to $r$.
(b) The second-order sufficient condition for a maximum really amounts to the
stipulation that the rate of growth of $V$ be strictly decreasing with time.


4. Analyze the comparative statics of the wine-storage problem. 


10.7 Further Applications of Exponential 
and Logarithmic Derivatives 





Aside from their use in optimization problems, the derivative formulas of Sec. 10.5 have
further useful economic applications.

Finding the Rate of Growth

When a variable $y$ is a function of time, $y = f(t)$, its instantaneous rate of growth is
defined as†

$$r_y \equiv \frac{dy/dt}{y} = \frac{f'(t)}{f(t)} = \frac{\text{marginal function}}{\text{total function}} \tag{10.23}$$

But, from (10.20), we see that this ratio is precisely the derivative of $\ln f(t) = \ln y$. Thus,
to find the instantaneous rate of growth of a function of time $f(t)$, we can (instead of
differentiating it with respect to $t$, and then dividing by $f(t)$) simply take its natural log
and then differentiate $\ln f(t)$ with respect to time.‡ This alternative method may turn out to
be the simpler approach, if $f(t)$ is a multiplicative or divisional expression which, upon
logarithm-taking, will reduce to a sum or difference of additive terms.

† If the variable $t$ does not denote time, the expression $(dy/dt)/y$ is referred to as the proportional rate
of change of $y$ with respect to $t$.

Example 1

Find the rate of growth of $V = Ae^{rt}$, where $t$ denotes time. It is already known to us that the
rate of growth of $V$ is $r$, but let us check it by finding the derivative of $\ln V$:

$$\ln V = \ln A + rt \ln e = \ln A + rt \qquad [A \text{ constant}]$$

Therefore,

$$r_V = \frac{d}{dt}\ln V = 0 + \frac{d}{dt}rt = r$$

as was to be demonstrated.

Example 2

Find the rate of growth of $y = 4^t$. In this case, we have

$$\ln y = \ln 4^t = t \ln 4$$

Hence

$$r_y = \frac{d}{dt}\ln y = \ln 4$$

This is as it should be, because $e^{\ln 4} = 4$, and consequently, $y = 4^t$ can be rewritten as
$y = e^{(\ln 4)t}$, which would immediately enable us to read $(\ln 4)$ as the rate of growth of $y$.
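
The log-derivative definition of the growth rate lends itself to a direct numerical check. The sketch below is an illustration, not part of the original text; the helper name `growth_rate` and the parameter values $A = 100$ and $r = 0.07$ are assumptions. It differentiates $\ln f(t)$ by a central difference and compares the result with the rates found in the two examples.

```python
import math

def growth_rate(f, t, h=1e-6):
    """Instantaneous rate of growth of f at time t, computed as d(ln f)/dt."""
    return (math.log(f(t + h)) - math.log(f(t - h))) / (2 * h)

A, r = 100.0, 0.07                      # illustrative values (assumptions)
V = lambda t: A * math.exp(r * t)       # V = A e^{rt}
y = lambda t: 4.0 ** t                  # y = 4^t

rate_V = growth_rate(V, 3.0)
rate_y = growth_rate(y, 3.0)
print(rate_V, r)                # should nearly coincide
print(rate_y, math.log(4))      # should nearly coincide
```

Both rates are constant over time, so the choice of the evaluation point `3.0` is immaterial here.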


Rate of Growth of a Combination of Functions 

To carry this discussion a step further, let us examine the instantaneous rate of growth of a
product of two functions of time:

$$y = uv \qquad \text{where} \quad \begin{cases} u = f(t) \\ v = g(t) \end{cases}$$

Taking the natural log of $y$, we obtain

$$\ln y = \ln u + \ln v$$

Thus the desired rate of growth is

$$r_y = \frac{d}{dt}\ln y = \frac{d}{dt}\ln u + \frac{d}{dt}\ln v$$

But the two terms on the right side are the rates of growth of $u$ and $v$, respectively. Thus we
have the rule

$$r_{(uv)} = r_u + r_v \tag{10.24}$$

Expressed in words, the instantaneous rate of growth of a product is the sum of the
instantaneous rates of growth of the components.

By a similar procedure, the rate of growth of a quotient can be shown to be the difference
between the rates of growth of the components (see Exercise 10.7-4):

$$r_{(u/v)} = r_u - r_v \tag{10.25}$$

‡ If we plot the natural log of a function $f(t)$ against $t$ in a two-dimensional diagram, the slope of the
curve, accordingly, will tell us the rate of growth of $f(t)$. This provides the rationale for the so-called
semilog scale charts, which are used for comparing the rates of growth of different variables, or the
rates of growth of the same variable in different countries.


Example 3

If consumption $C$ is growing at the rate $\alpha$, and if population $H$ (for “heads”) is growing
at the rate $\beta$, what is the rate of growth of per capita consumption? Since per capita
consumption is equal to $C/H$, its rate of growth should be

$$r_{(C/H)} = r_C - r_H = \alpha - \beta$$

Now consider the instantaneous rate of growth of a sum of two functions of time:

$$z = u + v \qquad \text{where} \quad \begin{cases} u = f(t) \\ v = g(t) \end{cases}$$

This time, the natural log will be

$$\ln z = \ln(u + v) \qquad [\neq \ln u + \ln v]$$

Thus

$$\begin{aligned}
r_z = \frac{d}{dt}\ln z &= \frac{d}{dt}\ln(u + v) \\
&= \frac{1}{u+v}\,\frac{d}{dt}(u + v) && [\text{by } (10.20)] \\
&= \frac{1}{u+v}\,[f'(t) + g'(t)]
\end{aligned}$$

But from (10.23) we have $r_u = f'(t)/f(t)$, so that $f'(t) = f(t)r_u = ur_u$. Similarly, we
have $g'(t) = vr_v$. As a result, we can write the rule

$$r_{(u+v)} = \frac{u}{u+v}r_u + \frac{v}{u+v}r_v \tag{10.26}$$

which shows that the rate of growth of a sum is a weighted average of the rates of growth
of the components.

By the same token, we have (see Exercise 10.7-6)

$$r_{(u-v)} = \frac{u}{u-v}r_u - \frac{v}{u-v}r_v \tag{10.27}$$

Example 4

The exports of goods of a country, $G = G(t)$, has a growth rate of $a/t$, and its exports of
services, $S = S(t)$, has a growth rate of $b/t$. What is the growth rate of its total exports?
Since total exports is $X(t) = G(t) + S(t)$, a sum, its rate of growth should be

$$r_X = \frac{G}{X}r_G + \frac{S}{X}r_S = \frac{G}{X}\left(\frac{a}{t}\right) + \frac{S}{X}\left(\frac{b}{t}\right) = \frac{Ga + Sb}{Xt}$$
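
The product, quotient, and sum rules (10.24) to (10.26) can be verified numerically with the same log-derivative device. In the sketch below, which is not part of the original text, the two component functions and their growth rates (3 and 5 percent) are hypothetical choices, as is the helper name `growth_rate`.

```python
import math

def growth_rate(f, t, h=1e-6):
    """Instantaneous rate of growth of f at t, computed as d(ln f)/dt."""
    return (math.log(f(t + h)) - math.log(f(t - h))) / (2 * h)

# Hypothetical components: u grows at 3 percent, v at 5 percent.
u = lambda t: 2.0 * math.exp(0.03 * t)
v = lambda t: 5.0 * math.exp(0.05 * t)
t0 = 10.0
r_u, r_v = growth_rate(u, t0), growth_rate(v, t0)

r_product = growth_rate(lambda t: u(t) * v(t), t0)     # rule (10.24): r_u + r_v
r_quotient = growth_rate(lambda t: u(t) / v(t), t0)    # rule (10.25): r_u - r_v
r_sum = growth_rate(lambda t: u(t) + v(t), t0)         # rule (10.26): weighted average
weighted = (u(t0) * r_u + v(t0) * r_v) / (u(t0) + v(t0))

print(r_product, r_u + r_v)
print(r_quotient, r_u - r_v)
print(r_sum, weighted)
```

Note that the weights in the sum rule depend on the levels of $u$ and $v$ at `t0`, so the growth rate of a sum generally changes over time even when the component rates are constant.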


Finding the Point Elasticity 

We have seen that, given $y = f(t)$, the derivative of $\ln y$ measures the instantaneous rate of
growth of $y$. Now let us see what happens when, given a function $y = f(x)$, we differentiate
$\ln y$ with respect to $\ln x$, rather than to $x$.

To begin with, let us define $u \equiv \ln y$ and $v \equiv \ln x$. Then we can observe a chain of
relationship linking $u$ to $y$, and thence to $x$ and $v$ as follows:

$$u \equiv \ln y \qquad y = f(x) \qquad x \equiv e^{\ln x} \equiv e^v$$

Accordingly, the derivative of $\ln y$ with respect to $\ln x$ is

$$\begin{aligned}
\frac{d(\ln y)}{d(\ln x)} = \frac{du}{dv} &= \frac{du}{dy}\frac{dy}{dx}\frac{dx}{dv} \\
&= \left(\frac{d}{dy}\ln y\right)\left(\frac{dy}{dx}\right)\left(\frac{d}{dv}e^v\right) = \frac{1}{y}\frac{dy}{dx}e^v = \frac{1}{y}\frac{dy}{dx}x = \frac{dy}{dx}\frac{x}{y}
\end{aligned}$$

But this expression is precisely that of the point elasticity of the function. Hence we have
established the general principle that, for a function $y = f(x)$, the point elasticity of $y$ with
respect to $x$ is

$$\varepsilon_{yx} = \frac{d(\ln y)}{d(\ln x)} \tag{10.28}$$

It should be noted that the subscript $yx$ in this symbol is an indicator that $y$ and $x$ are the
two variables involved and does not imply the multiplication of $y$ and $x$. This is unlike the
case of $r_{(uv)}$, where the subscript does denote a product. Again, we now have an alternative
way of finding the point elasticity of a function by use of logarithms, which may often
prove to be an easier approach, if the given function comes in the form of a multiplicative
or divisional expression.

Example 5

Find the point elasticity of demand, given that $Q = k/P$, where $k$ is a positive constant. This
is the equation of a rectangular hyperbola (see Fig. 2.8d); and, as is well known, a demand
function of this form has a unitary point elasticity at all points. To show this, we shall apply
(10.28). Since the natural log of the demand function is

$$\ln Q = \ln k - \ln P$$

the elasticity of demand ($Q$ with respect to $P$) is indeed

$$\varepsilon_d = \frac{d(\ln Q)}{d(\ln P)} = -1 \qquad \text{or} \qquad |\varepsilon_d| = 1$$


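The result in (10.28) was derived by use of the chain rule of derivatives. It is of interest
that a similar chain rule holds for elasticities; i.e., given a function $y = g(w)$, where
$w = h(x)$, we have

$$\varepsilon_{yx} = \varepsilon_{yw}\varepsilon_{wx} \tag{10.29}$$

The proof is as follows:

$$\varepsilon_{yw}\varepsilon_{wx} = \left(\frac{dy}{dw}\frac{w}{y}\right)\left(\frac{dw}{dx}\frac{x}{w}\right) = \frac{dy}{dw}\frac{dw}{dx}\frac{w}{y}\frac{x}{w} = \frac{dy}{dx}\frac{x}{y} = \varepsilon_{yx}$$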
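
Formula (10.28) translates into a short numerical routine: perturb $\ln x$ and observe the change in $\ln y$. The sketch below is an illustration under stated assumptions and not part of the original text. The helper name `point_elasticity`, the constant $k = 50$, and the functions used to illustrate the chain rule (10.29) are all hypothetical; the method requires $x > 0$ and $y > 0$ so that both logarithms exist.

```python
import math

def point_elasticity(f, x, h=1e-6):
    """d(ln y)/d(ln x): a central difference taken in ln x rather than in x."""
    ln_x = math.log(x)
    y_up = f(math.exp(ln_x + h))
    y_down = f(math.exp(ln_x - h))
    return (math.log(y_up) - math.log(y_down)) / (2 * h)

k = 50.0                            # illustrative constant (assumption)
demand = lambda P: k / P            # the rectangular-hyperbola demand Q = k/P

# Chain rule (10.29) with hypothetical y = g(w) = w**3 and w = h(x) = 2*x**2 + 1.
g = lambda w: w ** 3
h_fn = lambda x: 2 * x ** 2 + 1
x0 = 1.5
e_yw = point_elasticity(g, h_fn(x0))
e_wx = point_elasticity(h_fn, x0)
e_yx = point_elasticity(lambda x: g(h_fn(x)), x0)

print(point_elasticity(demand, 4.0))   # close to -1 at any positive price
print(e_yx, e_yw * e_wx)               # the two should nearly coincide
```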










EXERCISE 10.7 


1. Find the instantaneous rate of growth:
@ y=s? ( y= ab (@) y= 4/3! 
(b) y= at® (d) y= 240") 


2. If population grows according to the function $H = H_0(2)^{bt}$ and consumption by the
function $C = C_0e^{at}$, find the rates of growth of population, of consumption, and of
per capita consumption by using the natural log.


3. If $y$ is related to $x$ by $y = x^k$, how will the rates of growth $r_y$ and $r_x$ be related?

4. Prove that if $y = u/v$, where $u = f(t)$ and $v = g(t)$, then the rate of growth of $y$ will
be $r_y = r_u - r_v$, as shown in (10.25).

5. The real income $y$ is defined as the nominal income $Y$ deflated by the price level $P$.
How is $r_y$ (for real income) related to $r_Y$ (for nominal income)?


6. Prove the rate-of-growth rule (10.27).

7. Given the demand function $Q_d = k/P^n$, where $k$ and $n$ are positive constants, find the
point elasticity of demand $\varepsilon_d$ by using (10.28) (cf. Exercise 8.1-4).

8. (a) Given $y = wz$, where $w = g(x)$ and $z = h(x)$, establish that $\varepsilon_{yx} = \varepsilon_{wx} + \varepsilon_{zx}$.
(b) Given $y = u/v$, where $u = G(x)$ and $v = H(x)$, establish that $\varepsilon_{yx} = \varepsilon_{ux} - \varepsilon_{vx}$.

9. Given $y = f(x)$, show that the derivative $d(\log_b y)/d(\log_b x)$ (log to base $b$ rather
than $e$) also measures the point elasticity $\varepsilon_{yx}$.

10. Show that, if the demand for money $M_d$ is a function of the national income $Y = Y(t)$
and the interest rate $i = i(t)$, the rate of growth of $M_d$ can be expressed as a weighted
sum of $r_Y$ and $r_i$,

$$r_{M_d} = \varepsilon_{M_d Y}\, r_Y + \varepsilon_{M_d i}\, r_i$$

where the weights are the elasticities of $M_d$ with respect to $Y$ and $i$, respectively.

11. Given the production function $Q = F(K, L)$, find a general expression for the rate of
growth of $Q$ in terms of the rates of growth of $K$ and $L$.


Chapter 11


The Case of More than 
One Choice Variable 





The problem of optimization was discussed in Chap. 9 within the framework of an objective
function with a single choice variable. In Chap. 10 the discussion was extended to exponential
objective functions, but we still dealt with one choice variable only. Now we must develop
a way of finding the extreme values of an objective function that involves two or more choice
variables. Only then will we be able to tackle the type of problem confronting, say, a
multiproduct firm, where the profit-maximizing decision consists of the choice of optimal output
levels for several commodities and the optimal combination of several different inputs.

We shall discuss first the case of an objective function of two choice variables,
$z = f(x, y)$, in order to take advantage of its graphability. Later the analytical results can
be generalized to the nongraphable $n$-variable case. Regardless of the number of variables,
however, we shall assume in general that, when written in a general form, our objective
function possesses continuous partial derivatives to any desired order. This will ensure the
smoothness and differentiability of the objective function as well as its partial derivatives.

For functions of several variables, extreme values are again of two kinds: (1) absolute or
global and (2) relative or local. As before, our attention will be focused heavily on relative
extrema, and for this reason we shall often drop the adjective “relative,” with the
understanding that, unless otherwise specified, the extrema referred to are relative. However, in
Sec. 11.5, conditions for absolute extrema will be given due consideration.








11.1 The Differential Version of Optimization Conditions 





The discussion in Chap. 9 of optimization conditions for problems with a single choice 
variable was couched entirely in terms of derivatives, as against differentials. To prepare 
for the discussion of problems with two or more choice variables, it would be helpful also 
to know how those conditions can equivalently be expressed in terms of differentials. 


First-Order Condition 
Given a function $z = f(x)$, we can, as explained in Sec. 8.1, write the differential

$$dz = f'(x)\,dx \tag{11.1}$$

and use $dz$ as an approximation to the actual change, $\Delta z$, induced by the change of $x$ from
$x_0$ to $x_0 + \Delta x$; the smaller the $\Delta x$, the better the approximation. From (11.1), it is clear that
if $f'(x) > 0$, $dz$ and $dx$ must take the same algebraic sign; this is illustrated by point $A$ in
Fig. 11.1 (cf. Fig. 8.1b). In the opposite case where $f'(x) < 0$, exemplified by point $A'$, $dz$
and $dx$ take opposite algebraic signs. Since points like $A$ and $A'$ (where $f'(x) \neq 0$ and
hence $dz \neq 0$) cannot qualify as stationary points, it stands to reason that a necessary
condition for $z$ to attain an extremum (a stationary value) is $dz = 0$. More accurately, the
condition should be stated as “$dz = 0$ for an arbitrary nonzero $dx$,” since a zero $dx$ (no
change in $x$) has no relevance in our present context. In Fig. 11.1, a minimum of $z$ occurs at
point $B$, and a maximum of $z$ occurs at point $B'$. In both cases, with the tangent line
horizontal, i.e., with $f'(x) = 0$ there, $dz$ (the vertical side of the triangle formed with the
tangent line as the hypotenuse) indeed reduces to zero. Thus the first-order derivative
condition “$f'(x) = 0$” can be translated into the first-order differential condition “$dz = 0$ for an
arbitrary nonzero $dx$.” Bear in mind, however, that while this differential condition is
necessary for an extremum, it is by no means sufficient, because an inflection point such as
$C$ in Fig. 11.1 can also satisfy the condition that $dz = 0$ for an arbitrary nonzero $dx$.

[Figure 11.1: curve $z = f(x)$ with points $A$, $A'$, $B$, $B'$, and $C$; annotation “$dz = 0$ for nonzero $dx$”]


Second-Order Condition 

The second-order sufficient conditions for extrema of $z$ are, in terms of derivatives,
$f''(x) < 0$ (for a maximum) and $f''(x) > 0$ (for a minimum) at the stationary point. To
translate these conditions into differential equivalents, we need the notion of second-order
differential, defined as the differential of a differential, i.e., $d(dz)$, commonly denoted by
$d^2z$ (read: “d-two z”).

Given that $dz = f'(x)\,dx$, we can obtain $d^2z$ merely by further differentiation of $dz$. In
so doing, however, we should bear in mind that $dx$, representing in this context an arbitrary
or given nonzero change in $x$, is to be treated as a constant during differentiation.
Consequently, $dz$ can vary only with $f'(x)$, but since $f'(x)$ is in turn a function of $x$, $dz$ can in the
final analysis vary only with $x$. In view of this, we have

$$\begin{aligned}
d^2z \equiv d(dz) &= d[f'(x)\,dx] && [\text{by } (11.1)] \\
&= [df'(x)]\,dx && [dx \text{ is constant}] \\
&= [f''(x)\,dx]\,dx = f''(x)\,dx^2
\end{aligned} \tag{11.2}$$




Note that the exponent 2 appears in (11.2) in two fundamentally different ways. In the
symbol $d^2z$, the exponent 2 (read: “two”) indicates the second-order differential of $z$; but in the
symbol $dx^2 \equiv (dx)^2$, the exponent 2 (read: “squared”) denotes the squaring of the
first-order differential $dx$. The result in (11.2) provides a direct link between $d^2z$ and $f''(x)$.
Inasmuch as we are considering nonzero values of $dx$ only, the $dx^2$ term is always positive;
thus $d^2z$ and $f''(x)$ must take the same algebraic sign. Just as a positive (negative) $f''(x)$ at
a stationary point delineates a valley (peak), so must a positive (negative) $d^2z$ at such a
point.

It follows that the derivative condition “$f''(x) < 0$ is sufficient for a maximum of $z$” can
equivalently be stated as the differential condition “$d^2z < 0$ for an arbitrary nonzero $dx$ is
sufficient for a maximum of $z$.” The translation of the condition for a minimum of $z$ is analogous; we
just need to reverse the sense of inequality in the preceding sentence. Going one step further,
we may also conclude on the basis of (11.2) that the second-order necessary conditions

For maximum of $z$: $f''(x) \leq 0$
For minimum of $z$: $f''(x) \geq 0$

can be translated, respectively, into

For maximum of $z$: $d^2z \leq 0$
For minimum of $z$: $d^2z \geq 0$
(in both cases, for arbitrary nonzero values of $dx$)
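
The link $d^2z = f''(x)\,dx^2$ in (11.2) can be made concrete with a small numerical sketch. The example function $f(x) = x^3 - 3x$ and the function names below are assumptions chosen for illustration, not taken from the text. The sketch evaluates $dz$ and $d^2z$ at three points for both a positive and a negative $dx$, showing that the sign of $d^2z$ does not depend on the sign of $dx$.

```python
def f1(x):
    """f'(x) for the hypothetical example f(x) = x**3 - 3*x."""
    return 3 * x ** 2 - 3

def f2(x):
    """f''(x) for the same example."""
    return 6 * x

def differentials(x, dx):
    """Return (dz, d2z) = (f'(x) dx, f''(x) dx^2), following (11.1) and (11.2)."""
    return f1(x) * dx, f2(x) * dx ** 2

def classify(x, dx=0.01):
    """Apply the first- and second-order differential conditions at x.

    The exact comparison with zero is adequate only because the stationary
    points of this hand-picked example are exactly representable.
    """
    dz, d2z = differentials(x, dx)
    if dz != 0:
        return "not stationary"
    if d2z < 0:
        return "maximum"
    if d2z > 0:
        return "minimum"
    return "inconclusive"

for x in (-1.0, 1.0, 0.0):
    print(x, differentials(x, 0.01), differentials(x, -0.01), classify(x))
```

At $x = 0$ the first-order condition already fails, so the second-order test is never reached; a zero value of $d^2z$ at a stationary point would leave the sufficient condition silent, which the sketch reports as "inconclusive".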


Differential Conditions versus Derivative Conditions 

Now that we have demonstrated the possibility of expressing the derivative version of first-
and second-order conditions in terms of $dz$ and $d^2z$, you may very well ask why we bothered
to develop a new set of differential conditions when derivative conditions were already
available. The answer is that differential conditions, but not derivative conditions, are
stated in forms that can be directly generalized from the one-variable case to cases with two
or more choice variables. To be more specific, the first-order condition (zero value for $dz$)
and the second-order condition (negativity or positivity for $d^2z$) are applicable with equal
validity to all cases, provided the phrase “for arbitrary nonzero values of $dx$” is duly
modified to reflect the change in the number of choice variables.

This does not mean, however, that derivative conditions will have no further role to play.
To the contrary, since derivative conditions are operationally more convenient to apply, we
shall still attempt to develop and make use of derivative conditions appropriate to cases with
more choice variables, once the generalization process has been carried out by means of the
differential conditions.


11.2 Extreme Values of a Function of Two Variables 





For a function of one choice variable, an extreme value is represented graphically by the
peak of a hill or the bottom of a valley in a two-dimensional graph. With two choice
variables, the graph of the function $z = f(x, y)$ becomes a surface in a 3-space, and while
the extreme values are still to be associated with peaks and bottoms, these “hills” and
“valleys” themselves now take on a three-dimensional character. They will, in this new
context, be shaped like domes and bowls, respectively. The two diagrams in Fig. 11.2 serve
to illustrate. Point $A$ in diagram a, the peak of a dome, constitutes a maximum; the value of
$z$ at this point is larger than at any other point in its immediate neighborhood. Similarly,
point $B$ in diagram b, the bottom of a bowl, represents a minimum; everywhere in its
immediate neighborhood the value of the function exceeds that at point $B$.

[Figure 11.2: (a) dome-shaped surface with peak $A$; (b) bowl-shaped surface with bottom $B$]


First-Order Condition 
For the function

$$z = f(x, y)$$

the first-order necessary condition for an extremum (either maximum or minimum) again
involves $dz = 0$. But since there are two independent variables here, $dz$ is now a total
differential; thus the first-order condition should be modified to the form

$$dz = 0 \text{ for arbitrary values of } dx \text{ and } dy \text{, not both zero} \tag{11.3}$$

The rationale behind (11.3) is similar to the explanation of the condition $dz = 0$ for the
one-variable case: an extremum point must be a stationary point, and at a stationary point,
$dz$ as an approximation to the actual change $\Delta z$ must be zero for arbitrary $dx$ and $dy$, not
both zero.

In the present two-variable case, the total differential is

$$dz = f_x\,dx + f_y\,dy \tag{11.4}$$

In order to satisfy condition (11.3), it is necessary and sufficient that the two partial
derivatives $f_x$ and $f_y$ be simultaneously equal to zero. Thus the equivalent derivative version of
the first-order condition (11.3) is

$$f_x = f_y = 0 \qquad \left[\text{or} \quad \frac{\partial z}{\partial x} = \frac{\partial z}{\partial y} = 0\right] \tag{11.5}$$


There is a simple graphical interpretation of this condition. With reference to point $A$ in
Fig. 11.2, to have $f_x = 0$ at that point means that the tangent line $T_x$, drawn through $A$ and
parallel to the $xz$ plane (holding $y$ constant), must have a zero slope. By the same token, to
have $f_y = 0$ at point $A$ means that the tangent line $T_y$, drawn through $A$ and parallel to the
$yz$ plane (holding $x$ constant), must also have a zero slope. You can readily verify that these
tangent-line requirements actually also apply to the minimum point $B$ in Fig. 11.2. This is
because condition (11.5), like condition (11.3), is a necessary condition for both a maximum
and a minimum.

As in the earlier discussion, the first-order condition is necessary, but not sufficient. That
it is not sufficient to establish an extremum can be seen from the two diagrams in Fig. 11.3.
At point $C$ in diagram a, both $T_x$ and $T_y$ have zero slopes, but this point does not qualify as
an extremum: Whereas it is a minimum when viewed against the background of the $yz$
plane, it turns out to be a maximum when looked at against the $xz$ plane! A point with such
“dual personality” is referred to, for graphical reasons, as a saddle point. Similarly, point
$D$ in Fig. 11.3b, while characterized by flat $T_x$ and $T_y$, is no extremum, either; its location
on the twisted surface makes it an inflection point, whether viewed against the $xz$ or the $yz$
plane. These counterexamples decidedly rule out the first-order condition as a sufficient
condition for an extremum.

[Figure 11.3: (a) surface with saddle point $C$; (b) twisted surface with inflection point $D$]

To develop a sufficient condition, we must look to the second-order total differential,
which is related to second-order partial derivatives.
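
The insufficiency of the first-order condition can be seen in a small numerical sketch. The surface $z = y^2 - x^2$ used below is a hypothetical example chosen for illustration, not the surface drawn in the book's figure, and the helper name `partials` is likewise an assumption. Both partial derivatives vanish at the origin, yet the point is a maximum along the $x$ direction and a minimum along the $y$ direction.

```python
def partials(f, x, y, h=1e-5):
    """Central-difference estimates of (f_x, f_y) at the point (x, y)."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy

# Hypothetical saddle surface, stationary at the origin.
saddle = lambda x, y: y ** 2 - x ** 2

fx, fy = partials(saddle, 0.0, 0.0)
print(fx, fy)                                        # both partials vanish at (0, 0)
print(saddle(0.1, 0.0), saddle(0.0, 0.0), saddle(0.0, 0.1))
# z falls when we move along x but rises when we move along y: no extremum.
```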




Second-Order Partial Derivatives 











The function z = f(x, +) can give rise to two first-order partial derivatives, 
az az 
— and =— 
hes fy iy 


Since f, is itself a function of x (as well as of y), we can measure the rate of change of f, 
with respect to x, while y remains fixed, by a particular second-order (or second) partial 
derivative denoted by either f, or °z/dx?: 
a. ez a faz 
seth or Ss eEtls 
fh anf ) ax? ax (5) 
The notation fr: has a double subscript signifying that the primitive function f has been 
differentiated partially with respect to x twice, whereas the notation @7:/d.x? resembles that 





296. Part Four Optimization Problems 


Example 1 


Example 2 


of d?z/dx? except for the usc of the partial symbol. In a perfectly analogous manner, we 
can use the second partial derivative 

_ oO faz 

~ ay \ ay 


to denote the rate of change of , with respect to y, while x is held constant. 
Recall, however, that $f_x$ is also a function of $y$ and that $f_y$ is also a function of $x$. Hence, there can be written two more second partial derivatives:

$$f_{xy} \equiv \frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial y}\right) \qquad \text{and} \qquad f_{yx} \equiv \frac{\partial^2 z}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\right)$$

These are called cross (or mixed) partial derivatives because each measures the rate of change of one first-order partial derivative with respect to the "other" variable.

It bears repeating that the second-order partial derivatives of $z = f(x, y)$, like $z$ and the first derivatives $f_x$ and $f_y$, are also functions of the variables $x$ and $y$. When that fact requires emphasis, we can write $f_{xx}$ as $f_{xx}(x, y)$, and $f_{xy}$ as $f_{xy}(x, y)$, etc. And, along the same line, we can use the notation $f_{yx}(1, 2)$ to denote the value of $f_{yx}$ evaluated at $x = 1$ and $y = 2$, etc.

Even though $f_{xy}$ and $f_{yx}$ have been separately defined, they will, according to a proposition known as Young's theorem, have identical values, as long as the two cross partial derivatives are both continuous. In that case, the sequential order in which partial differentiation is undertaken becomes immaterial, because $f_{xy} = f_{yx}$. For the ordinary types of specific functions with which we work, this continuity condition is usually met; for general functions, as mentioned earlier, we always assume the continuity condition to hold. Hence, we may in general expect to find identical cross partial derivatives. In fact, the theorem applies also to functions of three or more variables. Given $z = g(u, v, w)$, for instance, the mixed partial derivatives will be characterized by $g_{uv} = g_{vu}$, $g_{uw} = g_{wu}$, etc., provided these partial derivatives are all continuous.








Example 1

Find the four second-order partial derivatives of

$$z = x^3 + 5xy - y^2$$

The first partial derivatives of this function are

$$f_x = 3x^2 + 5y \qquad \text{and} \qquad f_y = 5x - 2y$$

Therefore, upon further differentiation, we get

$$f_{xx} = 6x \qquad f_{yx} = 5 \qquad f_{xy} = 5 \qquad f_{yy} = -2$$

As expected, $f_{yx}$ and $f_{xy}$ are identical.

Example 2

Find all the second partial derivatives of $z = x^2 e^{-y}$. In this case, the first partial derivatives are

$$f_x = 2x e^{-y} \qquad \text{and} \qquad f_y = -x^2 e^{-y}$$

Thus we have

$$f_{xx} = 2e^{-y} \qquad f_{yx} = -2x e^{-y} \qquad f_{yy} = x^2 e^{-y} \qquad f_{xy} = -2x e^{-y}$$

Again, we see that $f_{yx} = f_{xy}$.


Note that the second partial derivatives are all functions of the original variables x and y. 
This fact is clear enough in Example 2, but it is true even for Example 1, although some 
second partial derivatives happen to be constant functions in that case. 
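
As a rough numerical cross-check of Example 1, the following Python sketch approximates the four second-order partial derivatives by central differences and confirms that the two cross partials agree. The function names and the step size `h` are illustrative assumptions, and finite differences only approximate the true derivatives.

```python
# Central-difference approximation of second-order partial derivatives.
# The test function is z = x**3 + 5*x*y - y**2 from Example 1.

def f(x, y):
    return x**3 + 5*x*y - y**2

def second_partials(func, x, y, h=1e-3):
    """Return (f_xx, cross_a, cross_b, f_yy) at the point (x, y).

    cross_a and cross_b are the two cross partials, obtained by
    differentiating in the two possible orders.
    """
    # First-order partials, themselves functions of the point (a, b)
    fx = lambda a, b: (func(a + h, b) - func(a - h, b)) / (2*h)
    fy = lambda a, b: (func(a, b + h) - func(a, b - h)) / (2*h)
    f_xx = (fx(x + h, y) - fx(x - h, y)) / (2*h)
    cross_a = (fx(x, y + h) - fx(x, y - h)) / (2*h)
    cross_b = (fy(x + h, y) - fy(x - h, y)) / (2*h)
    f_yy = (fy(x, y + h) - fy(x, y - h)) / (2*h)
    return f_xx, cross_a, cross_b, f_yy

print(second_partials(f, 1.0, 2.0))  # close to 6x = 6, 5, 5, -2
```

The step size trades truncation error against rounding error; for smooth functions like the ones in these examples, a moderate value is usually adequate.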


Second-Order Total Differential 


Given the total differential $dz$ in (11.4), and with the concept of second-order partial derivatives at our command, we can derive an expression for the second-order total differential $d^2z$ by further differentiation of $dz$. In so doing, we should remember that in the equation $dz = f_x\,dx + f_y\,dy$, the symbols $dx$ and $dy$ represent arbitrary or given changes in $x$ and $y$; so they must be treated as constants during differentiation. As a result, $dz$ depends only on $f_x$ and $f_y$, and since $f_x$ and $f_y$ are themselves functions of $x$ and $y$, $dz$, like $z$ itself, is a function of $x$ and $y$.

To obtain $d^2z$, we merely apply the definition of a differential, as shown in (11.4), to $dz$ itself. Thus,

$$\begin{aligned}
d^2z \equiv d(dz) &= \frac{\partial (dz)}{\partial x}\,dx + \frac{\partial (dz)}{\partial y}\,dy \qquad [\text{cf. } (11.4)] \\
&= \frac{\partial}{\partial x}(f_x\,dx + f_y\,dy)\,dx + \frac{\partial}{\partial y}(f_x\,dx + f_y\,dy)\,dy \\
&= (f_{xx}\,dx + f_{xy}\,dy)\,dx + (f_{yx}\,dx + f_{yy}\,dy)\,dy \\
&= f_{xx}\,dx^2 + f_{xy}\,dy\,dx + f_{yx}\,dx\,dy + f_{yy}\,dy^2 \\
&= f_{xx}\,dx^2 + 2f_{xy}\,dx\,dy + f_{yy}\,dy^2 \qquad [f_{xy} = f_{yx}]
\end{aligned} \qquad (11.6)$$

Note, again, that the exponent 2 appears in (11.6) in two different ways. In the symbol $d^2z$, the exponent 2 indicates the second-order total differential of $z$; but in the symbol $dx^2 \equiv (dx)^2$, the exponent denotes the squaring of the first-order differential $dx$.

The result in (11.6) shows the magnitude of $d^2z$ in terms of given values of $dx$ and $dy$, measured from some point $(x_0, y_0)$ in the domain. In order to calculate $d^2z$, however, we also need to know the second-order partial derivatives $f_{xx}$, $f_{xy}$, and $f_{yy}$, all evaluated at $(x_0, y_0)$, just as we need to know the first-order partial derivatives to calculate $dz$ from (11.4).


Example 3

Given $z = x^3 + 5xy - y^2$, find $dz$ and $d^2z$. This function is the same as the one in Example 1. Thus, substituting the various derivatives already obtained there into (11.4) and (11.6), we find†

$$dz = (3x^2 + 5y)\,dx + (5x - 2y)\,dy$$

and

$$d^2z = 6x\,dx^2 + 10\,dx\,dy - 2\,dy^2$$

We can also calculate $dz$ and $d^2z$ at specific points in the domain. At the point $x = 1$ and $y = 2$, for instance, we have

$$dz = 13\,dx + dy \qquad \text{and} \qquad d^2z = 6\,dx^2 + 10\,dx\,dy - 2\,dy^2$$

† An alternative way of reaching these results is by direct differentiation of the function:

$$\begin{aligned}
dz &= d(x^3) + d(5xy) - d(y^2) \\
&= 3x^2\,dx + 5y\,dx + 5x\,dy - 2y\,dy
\end{aligned}$$

Further differentiation of $dz$ (bearing in mind that $dx$ and $dy$ are constants) will then yield

$$\begin{aligned}
d^2z &= d(3x^2)\,dx + d(5y)\,dx + d(5x)\,dy - d(2y)\,dy \\
&= (6x\,dx)\,dx + (5\,dy)\,dx + (5\,dx)\,dy - (2\,dy)\,dy \\
&= 6x\,dx^2 + 10\,dx\,dy - 2\,dy^2
\end{aligned}$$
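
To make the evaluation at a point concrete, here is a small Python sketch that computes $dz$ and $d^2z$ for Example 3 at $x = 1$, $y = 2$ for a few chosen values of $dx$ and $dy$. The function names are hypothetical, and the derivative formulas are simply those obtained in Example 1.

```python
# dz and d2z for z = x**3 + 5*x*y - y**2, using (11.4) and (11.6).

def total_differential(x, y, dx, dy):
    f_x = 3*x**2 + 5*y
    f_y = 5*x - 2*y
    return f_x*dx + f_y*dy

def second_order_differential(x, y, dx, dy):
    f_xx, f_xy, f_yy = 6*x, 5, -2
    # (11.6): d2z = f_xx dx^2 + 2 f_xy dx dy + f_yy dy^2
    return f_xx*dx**2 + 2*f_xy*dx*dy + f_yy*dy**2

for dx, dy in [(1, 0), (0, 1), (1, 1)]:
    print(dx, dy,
          total_differential(1, 2, dx, dy),
          second_order_differential(1, 2, dx, dy))
```

Note that $d^2z$ at this point is positive for some directions and negative for others, so its sign depends on the chosen $dx$ and $dy$.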


Second-Order Condition 

In the one-variable case, $d^2z < 0$ at a stationary point identifies the point as the peak of a hill in a 2-space. Similarly, in the two-variable case, $d^2z < 0$ at a stationary point would identify the point as the peak of a dome in a 3-space. Thus, once the first-order necessary condition is satisfied, the second-order sufficient condition for a maximum of $z = f(x, y)$ is

$$d^2z < 0 \quad \text{for arbitrary values of } dx \text{ and } dy, \text{ not both zero} \qquad (11.7)$$

A positive $d^2z$ value at a stationary point, on the other hand, is associated with the bottom of a bowl. The second-order sufficient condition for a minimum of $z = f(x, y)$ is

$$d^2z > 0 \quad \text{for arbitrary values of } dx \text{ and } dy, \text{ not both zero} \qquad (11.8)$$

The reason why (11.7) and (11.8) are only sufficient, but not necessary, conditions is that it is again possible for $d^2z$ to take a zero value at a maximum or a minimum. For this reason, second-order necessary conditions must be stated with weak inequalities as follows:

$$\begin{aligned}
&\text{For maximum of } z\text{:} \quad d^2z \le 0 \\
&\text{For minimum of } z\text{:} \quad d^2z \ge 0
\end{aligned} \quad \text{for arbitrary values of } dx \text{ and } dy, \text{ not both zero} \qquad (11.9)$$

In the following, however, we shall pay more attention to the second-order sufficient conditions.

For operational convenience, second-order differential conditions can be translated into equivalent conditions on second-order derivatives. In the two-variable case, (11.6) shows that this would entail restrictions on the signs of the second-order partial derivatives $f_{xx}$, $f_{xy}$, and $f_{yy}$. The actual translation would require a knowledge of quadratic forms, which will be discussed in Sec. 11.3. But we may first introduce the main result here: For any values of $dx$ and $dy$, not both zero,

$$\begin{aligned}
d^2z < 0 \quad &\text{iff} \quad f_{xx} < 0; \quad f_{yy} < 0; \quad \text{and} \quad f_{xx} f_{yy} > f_{xy}^2 \\
d^2z > 0 \quad &\text{iff} \quad f_{xx} > 0; \quad f_{yy} > 0; \quad \text{and} \quad f_{xx} f_{yy} > f_{xy}^2
\end{aligned}$$

Note that the sign of $d^2z$ hinges not only on $f_{xx}$ and $f_{yy}$, which have to do with the surface configuration around point $A$ (Fig. 11.4) in the two basic directions shown by $T_x$ (east-west) and $T_y$ (north-south), but also on the cross partial derivative $f_{xy}$. The role played by this latter partial derivative is to ensure that the surface in question will yield (two-dimensional) cross sections with the same type of configuration (hill or valley, as the case may be) not only in the two basic directions (east-west and north-south), but in all other possible directions (such as northeast-southwest) as well.


FIGURE 11.4


This result, together with the first-order condition (11.5), enables us to construct Table 11.1. It should be understood that all the second partial derivatives therein are to be evaluated at the stationary point where $f_x = f_y = 0$. It should also be stressed that the second-order sufficient condition is not necessary for an extremum. In particular, if a stationary value is characterized by $f_{xx} f_{yy} = f_{xy}^2$, in violation of that condition, that stationary value may nevertheless turn out to be an extremum. On the other hand, in the case of another type of violation, with a stationary point characterized by $f_{xx} f_{yy} < f_{xy}^2$, we can identify that point as a saddle point, because the sign of $d^2z$ will in that case be indefinite (positive for some values of $dx$ and $dy$, but negative for others).





TABLE 11.1 Conditions for Relative Extremum: $z = f(x, y)$

Condition                             Maximum                             Minimum
First-order necessary condition       $f_x = f_y = 0$                     $f_x = f_y = 0$
Second-order sufficient condition†    $f_{xx}, f_{yy} < 0$                $f_{xx}, f_{yy} > 0$
                                      and $f_{xx} f_{yy} > f_{xy}^2$      and $f_{xx} f_{yy} > f_{xy}^2$

† Applicable only after the first-order necessary condition has been satisfied.

Example 4

Find the extreme value(s) of $z = 8x^3 + 2xy - 3x^2 + y^2 + 1$. First let us find all the first and second partial derivatives:

$$f_x = 24x^2 + 2y - 6x \qquad f_y = 2x + 2y$$
$$f_{xx} = 48x - 6 \qquad f_{yy} = 2 \qquad f_{xy} = 2$$

The first-order condition calls for satisfaction of the simultaneous equations $f_x = 0$ and $f_y = 0$; that is,

$$\begin{aligned}
24x^2 + 2y - 6x &= 0 \\
2y + 2x &= 0
\end{aligned}$$

The second equation implies that $y = -x$, and when this information is substituted into the first equation, we get $24x^2 - 8x = 0$, which yields the pair of solutions

$$\begin{aligned}
x_1^* &= 0 \qquad [\text{implying } y_1^* = 0] \\
x_2^* &= \tfrac{1}{3} \qquad [\text{implying } y_2^* = -\tfrac{1}{3}]
\end{aligned}$$

To apply the second-order condition, we note that, when $x_1^* = y_1^* = 0$, $f_{xx}$ turns out to be $-6$, while $f_{yy}$ is 2, so that $f_{xx} f_{yy}$ is negative and is necessarily less than a squared value $f_{xy}^2$. This fails the second-order condition. The fact that $f_{xx}$ and $f_{yy}$ have opposite signs suggests, of course, that the surface in question will curl upward in one direction but downward in another, thereby giving rise to a saddle point.

What about the other solution? When evaluated at $x_2^* = \tfrac{1}{3}$, we find that $f_{xx} = 10$, which, together with the fact that $f_{yy} = f_{xy} = 2$, meets all three parts of the second-order sufficient condition for a minimum. Therefore, by setting $x = \tfrac{1}{3}$ and $y = -\tfrac{1}{3}$ in the given function, we can obtain as a minimum of $z$ the value $z^* = \tfrac{23}{27}$. In the present example, there thus exists only one relative extremum (a minimum), which can be represented by the ordered triple

$$(x^*, y^*, z^*) = \left(\tfrac{1}{3},\; -\tfrac{1}{3},\; \tfrac{23}{27}\right)$$
Example 5

Find the extreme value(s) of $z = x + 2ey - e^x - e^{2y}$. The relevant derivatives of this function are

$$f_x = 1 - e^x \qquad f_y = 2e - 2e^{2y}$$
$$f_{xx} = -e^x \qquad f_{yy} = -4e^{2y} \qquad f_{xy} = 0$$

To satisfy the necessary condition, we must have

$$\begin{aligned}
1 - e^x &= 0 \\
2e - 2e^{2y} &= 0
\end{aligned}$$

which has only one solution, namely, $x^* = 0$ and $y^* = \tfrac{1}{2}$. To ascertain the status of the value of $z$ corresponding to this solution (the stationary value), we evaluate the second-order derivatives at $x = 0$ and $y = \tfrac{1}{2}$, and find that $f_{xx} = -1$, $f_{yy} = -4e$, and $f_{xy} = 0$. Since $f_{xx}$ and $f_{yy}$ are both negative and since, in addition, $(-1)(-4e) > 0$, we may conclude that the $z$ value in question, namely,

$$z^* = 0 + e - e^0 - e^1 = -1$$

is a maximum value of the function. This maximum point on the given surface can be denoted by the ordered triple $(x^*, y^*, z^*) = (0, \tfrac{1}{2}, -1)$.

Again, note that, to evaluate the second partial derivatives at $x^*$ and $y^*$, differentiation must be undertaken first, and then the specific values of $x^*$ and $y^*$ are to be substituted into the derivatives as the final step.
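
The classification logic of Table 11.1 can be sketched in a few lines of Python. The helper name `classify` and its return strings are illustrative assumptions; the derivative values fed into it are those worked out by hand in Examples 4 and 5, and exact fractions are used for Example 4 to avoid rounding.

```python
from fractions import Fraction
import math

def classify(f_xx, f_yy, f_xy):
    """Second-order test at a stationary point (where f_x = f_y = 0)."""
    gap = f_xx * f_yy - f_xy**2
    if gap > 0:
        # With gap > 0, f_xx and f_yy necessarily share the same sign.
        return "minimum" if f_xx > 0 else "maximum"
    if gap < 0:
        return "saddle point"
    return "test inconclusive"   # f_xx f_yy = f_xy^2: condition not met

# Example 4: z = 8x^3 + 2xy - 3x^2 + y^2 + 1, with f_xx = 48x - 6, f_yy = f_xy = 2
def z4(x, y):
    return 8*x**3 + 2*x*y - 3*x**2 + y**2 + 1

third = Fraction(1, 3)
example4_at_origin = classify(48*0 - 6, 2, 2)
example4_at_third = classify(48*third - 6, 2, 2)
z_star_4 = z4(third, -third)

# Example 5: f_xx = -e^x, f_yy = -4 e^(2y), f_xy = 0, evaluated at (0, 1/2)
example5 = classify(-math.exp(0), -4*math.exp(2*0.5), 0)

print(example4_at_origin, example4_at_third, z_star_4, example5)
```

The function deliberately reports an inconclusive result when $f_{xx} f_{yy} = f_{xy}^2$, since in that case the sufficient condition fails without ruling out an extremum.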





EXERCISE 11.2

Use Table 11.1 to find the extreme value(s) of each of the following four functions, and determine whether they are maxima or minima:

1. $z = x^2 + xy + 2y^2 + 3$
2. $z = -x^2 - y^2 + 6x + 2y$
3. $z = ax^2 + by^2 + c$; consider each of the three subcases:
   (a) $a > 0$, $b > 0$   (b) $a < 0$, $b < 0$   (c) $a$ and $b$ opposite in sign
4. $z = e^{2x} - 2x + 2y^2 + 3$
5. Consider the function $z = (x - 2)^4 + (y - 3)^4$.
   (a) Establish by intuitive reasoning that $z$ attains a minimum ($z^* = 0$) at $x^* = 2$ and $y^* = 3$.
   (b) Is the first-order necessary condition in Table 11.1 satisfied?
   (c) Is the second-order sufficient condition in Table 11.1 satisfied?
   (d) Find the value of $d^2z$. Does it satisfy the second-order necessary condition for a minimum in (11.9)?


11.3. Quadratic Forms—An Excursion 





The expression for $d^2z$ on the last line of (11.6) exemplifies what are known as quadratic forms, for which there exist established criteria for determining whether their signs are always positive, negative, nonpositive, or nonnegative, for arbitrary values of $dx$ and $dy$, not both zero. Since the second-order condition for extremum hinges directly on the sign of $d^2z$, those criteria are of direct interest.

To begin with, we define a form as a polynomial expression in which each component term has a uniform degree. Our earlier encounter with polynomials was confined to the case of a single variable: $a_0 + a_1 x + \cdots + a_n x^n$. When more variables are involved, each term of a polynomial may contain either one variable or several variables, each raised to a nonnegative integer power, such as $3x + 4x^2y^3 - 2yz$. In the special case where each term has a uniform degree (i.e., where the sum of exponents in each term is uniform) the polynomial is called a form. For example, $4x - 9y + z$ is a linear form in three variables, because each of its terms is of the first degree. On the other hand, the polynomial $4x^2 - xy + 3y^2$, in which each term is of the second degree (sum of integer exponents = 2), constitutes a quadratic form in two variables. We may also encounter quadratic forms in three variables, such as $x^2 + 2xy - yw - 7w^2$, or indeed in $n$ variables.





Second-Order Total Differential as a Quadratic Form 


If we consider the differentials $dx$ and $dy$ in (11.6) as variables and the partial derivatives as coefficients, i.e., if we let

$$u \equiv dx \qquad v \equiv dy \qquad a \equiv f_{xx} \qquad b \equiv f_{yy} \qquad h \equiv f_{xy} \; [= f_{yx}] \qquad (11.10)$$

then the second-order total differential

$$d^2z = f_{xx}\,dx^2 + 2f_{xy}\,dx\,dy + f_{yy}\,dy^2$$

can easily be identified as a quadratic form $q$ in the two variables $u$ and $v$:

$$q = au^2 + 2huv + bv^2 \qquad (11.6')$$

Note that, in this quadratic form, $dx \equiv u$ and $dy \equiv v$ are cast in the role of variables, whereas the second partial derivatives are treated as constants, which is the exact opposite of the situation when we were differentiating $dz$ to get $d^2z$. The reason for this role reversal lies in the changed nature of the problem we are now dealing with. The second-order sufficient condition for extremum stipulates $d^2z$ to be definitely positive (for a minimum) and definitely negative (for a maximum), regardless of the values that $dx$ and $dy$ may take (so long as they are not both zero). It is obvious, therefore, that in the present context $dx$ and $dy$ must be considered as variables. The second partial derivatives, on the other hand, will assume specific values at the points we are examining as possible extremum points, and thus may be regarded as constants.

The major question becomes, then: What restrictions must be placed upon $a$, $b$, and $h$ in (11.6'), while $u$ and $v$ are allowed to take any values, in order to ensure a definite sign for $q$?


Positive and Negative Definiteness 
As a matter of terminology, let us remark that a quadratic form $q$ is said to be

    positive definite         if q is invariably positive      (> 0)
    positive semidefinite     if q is invariably nonnegative   (≥ 0)
    negative semidefinite     if q is invariably nonpositive   (≤ 0)
    negative definite         if q is invariably negative      (< 0)

regardless of the values of the variables in the quadratic form, not all zero. If $q$ changes signs when the variables assume different values, on the other hand, $q$ is said to be indefinite. Clearly, the cases of positive and negative definiteness of $q = d^2z$ are related to the second-order sufficient conditions for a minimum and a maximum, respectively. The cases of semidefiniteness, on the other hand, relate to second-order necessary conditions. When $q = d^2z$ is indefinite, we have the symptom of a saddle point.


Determinantal Test for Sign Definiteness 
A widely used test for the sign definiteness of $q$ calls for the examination of the signs of certain determinants. This test happens to be more easily applicable to positive and negative definiteness (as against semidefiniteness); that is, it applies more easily to second-order sufficient (as against necessary) conditions. We shall confine our discussion here to the sufficient conditions only.†

For the two-variable case, determinantal conditions for the sign definiteness of $q$ are relatively easy to derive. In the first place, we see that the signs of the first and third terms in (11.6') are independent of the values of the variables $u$ and $v$, because these variables appear in squares. Thus it is easy to specify the condition for the positive or negative definiteness of these terms alone, by restricting the signs of $a$ and $b$. The trouble spot lies in the middle term. But if we can convert the entire polynomial into an expression such that the variables $u$ and $v$ appear only in some squares, the definiteness of the sign of $q$ will again become tractable.

† For a discussion of a determinantal test for second-order necessary conditions, see Alpha C. Chiang, Elements of Dynamic Optimization, Waveland Press Inc., 1992, pp. 85-90.


The device that will do the trick is that of completing the square. By adding $h^2v^2/a$ to, and subtracting the same quantity from, the right side of (11.6'), we can rewrite the quadratic form as follows:

$$\begin{aligned}
q &= au^2 + 2huv + \frac{h^2}{a}v^2 + bv^2 - \frac{h^2}{a}v^2 \\
&= a\left(u^2 + \frac{2h}{a}uv + \frac{h^2}{a^2}v^2\right) + \left(b - \frac{h^2}{a}\right)v^2 \\
&= a\left(u + \frac{h}{a}v\right)^2 + \frac{ab - h^2}{a}\,(v^2)
\end{aligned}$$

Now that the variables $u$ and $v$ appear only in squares, we can predicate the sign of $q$ entirely on the values of the coefficients $a$, $b$, and $h$ as follows:

$$q \text{ is } \begin{Bmatrix} \text{positive definite} \\ \text{negative definite} \end{Bmatrix} \text{ iff } \begin{Bmatrix} a > 0 \\ a < 0 \end{Bmatrix} \text{ and } ab - h^2 > 0 \qquad (11.11)$$

Note that (1) $ab - h^2$ should be positive in both cases and (2) as a prerequisite for the positivity of $ab - h^2$, the product $ab$ must be positive (since it must exceed the squared term $h^2$); hence, this condition automatically implies that $a$ and $b$ must take the identical algebraic sign.
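
The completing-the-square identity can be spot-checked with exact arithmetic. The sketch below, with hypothetical function names, evaluates the quadratic form both directly and in its completed-square version for the coefficients $a = 5$, $b = 2$, $h = 3/2$; it assumes $a \neq 0$, as the derivation itself does.

```python
from fractions import Fraction

def q_direct(a, b, h, u, v):
    # q = a u^2 + 2 h u v + b v^2, as in (11.6')
    return a*u**2 + 2*h*u*v + b*v**2

def q_completed(a, b, h, u, v):
    # q = a (u + (h/a) v)^2 + ((ab - h^2)/a) v^2, valid when a != 0
    return a*(u + h*v/a)**2 + (a*b - h**2)/a * v**2

a, b, h = Fraction(5), Fraction(2), Fraction(3, 2)
points = [(Fraction(1), Fraction(-2)), (Fraction(3), Fraction(7)), (Fraction(0), Fraction(1))]
for u, v in points:
    assert q_direct(a, b, h, u, v) == q_completed(a, b, h, u, v)

# With a > 0 and ab - h^2 > 0, both squared terms carry positive
# coefficients, so q is positive at every nonzero (u, v) tried here.
print([q_direct(a, b, h, u, v) > 0 for u, v in points])
```

Agreement at a handful of points is only a sanity check, not a proof; the algebra above is what establishes the identity.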

The condition just derived may be stated more succinctly by the use of determinants. We observe first that the quadratic form in (11.6') can be rearranged into the following square, symmetric format:

$$\begin{aligned}
q = {} & a(u^2) + h(uv) \\
& + h(vu) + b(v^2)
\end{aligned}$$

with the squared terms placed on the diagonal and with the $2huv$ term split into two equal parts and placed off the diagonal. The coefficients now form a symmetric matrix, with $a$ and $b$ on the principal diagonal and $h$ off the diagonal. Viewed in this light, the quadratic form is also easily seen to be the $1 \times 1$ matrix (a scalar) resulting from the following matrix multiplication:

$$q = \begin{bmatrix} u & v \end{bmatrix} \begin{bmatrix} a & h \\ h & b \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix}$$

Note that this is a more generalized case of the matrix product $x'Ax$ discussed in Sec. 4.4, Example 5. In that example, with a so-called diagonal matrix (a symmetric matrix with only zeros as its off-diagonal elements) as $A$, the product $x'Ax$ represents a weighted sum of squares. Here, with any symmetric matrix as $A$ (allowing nonzero off-diagonal elements to appear), the product $x'Ax$ is a quadratic form.

The determinant of the $2 \times 2$ coefficient matrix, $\begin{vmatrix} a & h \\ h & b \end{vmatrix}$ (which is referred to as the discriminant of the quadratic form $q$, and which we shall therefore denote by $|D|$), supplies the clue to the criterion in (11.11), for the latter can be alternatively expressed as:

$$q \text{ is } \begin{Bmatrix} \text{positive definite} \\ \text{negative definite} \end{Bmatrix} \text{ iff } \begin{Bmatrix} |a| > 0 \\ |a| < 0 \end{Bmatrix} \text{ and } \begin{vmatrix} a & h \\ h & b \end{vmatrix} > 0 \qquad (11.11')$$


The determinant $|a| = a$ is simply the first leading principal minor of $|D|$. The determinant $\begin{vmatrix} a & h \\ h & b \end{vmatrix}$ is, on the other hand, the second leading principal minor of $|D|$. In the present case, there are only two leading principal minors available, and their signs will serve to determine the positive or negative definiteness of $q$.

When (11.11') is translated, via (11.10), into terms of the second-order total differential $d^2z$, we have

$$d^2z \text{ is } \begin{Bmatrix} \text{positive definite} \\ \text{negative definite} \end{Bmatrix} \text{ iff } \begin{Bmatrix} f_{xx} > 0 \\ f_{xx} < 0 \end{Bmatrix} \text{ and } \begin{vmatrix} f_{xx} & f_{xy} \\ f_{xy} & f_{yy} \end{vmatrix} = f_{xx} f_{yy} - f_{xy}^2 > 0$$

Recalling that the latter inequality implies that $f_{xx}$ and $f_{yy}$ are required to take the same sign, we see that this is precisely the second-order sufficient condition presented in Table 11.1.

In general, the discriminant of a quadratic form

$$q = au^2 + 2huv + bv^2$$

is the symmetric determinant $\begin{vmatrix} a & h \\ h & b \end{vmatrix}$. In the particular case of the quadratic form

$$d^2z = f_{xx}\,dx^2 + 2f_{xy}\,dx\,dy + f_{yy}\,dy^2$$

the discriminant is a determinant with the second-order partial derivatives as its elements. Such a determinant is called a Hessian determinant (or simply a Hessian). In the two-variable case, the Hessian is

$$|H| = \begin{vmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{vmatrix}$$

which, in view of Young's theorem ($f_{xy} = f_{yx}$), is symmetric, as a discriminant should be. You should carefully distinguish the Hessian determinant from the Jacobian determinant discussed in Sec. 7.6.

Example 1

Is $q = 5u^2 + 3uv + 2v^2$ either positive or negative definite? The discriminant of $q$ is $\begin{vmatrix} 5 & 1.5 \\ 1.5 & 2 \end{vmatrix}$, with leading principal minors

$$5 > 0 \qquad \text{and} \qquad \begin{vmatrix} 5 & 1.5 \\ 1.5 & 2 \end{vmatrix} = 7.75 > 0$$

Therefore $q$ is positive definite.

Example 2

Given $f_{xx} = -2$, $f_{xy} = 1$, and $f_{yy} = -1$ at a certain point on a function $z = f(x, y)$, does $d^2z$ have a definite sign at that point regardless of the values of $dx$ and $dy$? The discriminant of the quadratic form $d^2z$ is in this case $\begin{vmatrix} -2 & 1 \\ 1 & -1 \end{vmatrix}$, with leading principal minors

$$-2 < 0 \qquad \text{and} \qquad \begin{vmatrix} -2 & 1 \\ 1 & -1 \end{vmatrix} = 1 > 0$$

Thus $d^2z$ is negative definite.


Three-Variable Quadratic Forms 
Can similar conditions be obtained for a quadratic form in three variables? A quadratic form with three variables $u_1$, $u_2$, and $u_3$ may be generally represented as

$$\begin{aligned}
q(u_1, u_2, u_3) = {} & d_{11}(u_1^2) + d_{12}(u_1u_2) + d_{13}(u_1u_3) \\
& + d_{21}(u_2u_1) + d_{22}(u_2^2) + d_{23}(u_2u_3) \\
& + d_{31}(u_3u_1) + d_{32}(u_3u_2) + d_{33}(u_3^2) \\
= {} & \sum_{i=1}^{3} \sum_{j=1}^{3} d_{ij} u_i u_j
\end{aligned} \qquad (11.12)$$

where the double-$\sum$ (double-sum) notation means that both the index $i$ and the index $j$ are allowed to take the values 1, 2, and 3; and thus the double-sum expression is equivalent to the $3 \times 3$ array shown in Eq. (11.12). Such a square array of the quadratic form is, incidentally, always to be considered a symmetric one, even though we have written the pair of coefficients $(d_{12}, d_{21})$ or $(d_{23}, d_{32})$ as if the two members of each pair were different. For if the term in the quadratic form involving the variables $u_1$ and $u_2$ happens to be, say, $12u_1u_2$, we can let $d_{12} = d_{21} = 6$, so that $d_{12}u_1u_2 = d_{21}u_2u_1$, and a similar procedure may be applied to make the other off-diagonal elements symmetrical.

Actually, this three-variable quadratic form is again expressible as a product of three matrices:

$$q(u_1, u_2, u_3) = \begin{bmatrix} u_1 & u_2 & u_3 \end{bmatrix} \begin{bmatrix} d_{11} & d_{12} & d_{13} \\ d_{21} & d_{22} & d_{23} \\ d_{31} & d_{32} & d_{33} \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} \equiv u'Du \qquad (11.12')$$

As in the two-variable case, the first matrix (a row vector) and the third matrix (a column vector) merely list the variables, and the middle one ($D$) is a symmetric coefficient matrix from the square-array version of the quadratic form in (11.12). This time, however, a total of three leading principal minors can be formed from its discriminant, namely,

$$|D_1| \equiv d_{11} \qquad |D_2| \equiv \begin{vmatrix} d_{11} & d_{12} \\ d_{21} & d_{22} \end{vmatrix} \qquad |D_3| \equiv \begin{vmatrix} d_{11} & d_{12} & d_{13} \\ d_{21} & d_{22} & d_{23} \\ d_{31} & d_{32} & d_{33} \end{vmatrix}$$

where $|D_i|$ denotes the $i$th leading principal minor of the discriminant $|D|$.† It turns out that the conditions for positive or negative definiteness can again be stated in terms of certain sign restrictions on these principal minors.

† We have so far viewed the $i$th leading principal minor $|D_i|$ as a subdeterminant formed by retaining the first $i$ principal-diagonal elements of $|D|$. Since the notion of a minor implies the deletion of something from the original determinant, however, you may prefer to view the $i$th leading principal minor alternatively as a subdeterminant formed by deleting the last $(n - i)$ rows and columns of $|D|$.

By the now-familiar device of completing the square, the quadratic form in (11.12) can be converted into an expression in which the three variables appear only as components of


some squares. Specifically, recalling that $d_{12} = d_{21}$, etc., we have

$$\begin{aligned}
q = {} & d_{11}\left(u_1 + \frac{d_{12}}{d_{11}}u_2 + \frac{d_{13}}{d_{11}}u_3\right)^2 + \frac{d_{11}d_{22} - d_{12}^2}{d_{11}}\left(u_2 + \frac{d_{11}d_{23} - d_{12}d_{13}}{d_{11}d_{22} - d_{12}^2}\,u_3\right)^2 \\
& + \frac{d_{11}d_{22}d_{33} - d_{11}d_{23}^2 - d_{22}d_{13}^2 - d_{33}d_{12}^2 + 2d_{12}d_{13}d_{23}}{d_{11}d_{22} - d_{12}^2}\,(u_3)^2
\end{aligned}$$

This sum of squares will be positive (negative) for any values of $u_1$, $u_2$, and $u_3$, not all zero, if and only if the coefficients of the three squared expressions are all positive (negative). But the three coefficients (in the order given) can be expressed in terms of the three leading principal minors as follows:

$$|D_1| \qquad \frac{|D_2|}{|D_1|} \qquad \frac{|D_3|}{|D_2|}$$

Hence, for positive definiteness, the necessary-and-sufficient condition is threefold:

$$\begin{aligned}
&|D_1| > 0 \\
&|D_2| > 0 \qquad [\text{given that } |D_1| > 0 \text{ already}] \\
&|D_3| > 0 \qquad [\text{given that } |D_2| > 0 \text{ already}]
\end{aligned}$$

In other words, the three leading principal minors must all be positive. For negative definiteness, on the other hand, the necessary-and-sufficient condition becomes:

$$\begin{aligned}
&|D_1| < 0 \\
&|D_2| > 0 \qquad [\text{given that } |D_1| < 0 \text{ already}] \\
&|D_3| < 0 \qquad [\text{given that } |D_2| > 0 \text{ already}]
\end{aligned}$$

That is, the three leading principal minors must alternate in sign in the specified manner.





Example 3

Determine whether $q = u_1^2 + 6u_2^2 + 3u_3^2 - 2u_1u_2 - 4u_2u_3$ is either positive or negative definite. The discriminant of $q$ is

$$\begin{vmatrix} 1 & -1 & 0 \\ -1 & 6 & -2 \\ 0 & -2 & 3 \end{vmatrix}$$

with leading principal minors as follows:

$$1 > 0 \qquad \begin{vmatrix} 1 & -1 \\ -1 & 6 \end{vmatrix} = 5 > 0 \qquad \text{and} \qquad \begin{vmatrix} 1 & -1 & 0 \\ -1 & 6 & -2 \\ 0 & -2 & 3 \end{vmatrix} = 11 > 0$$

Therefore, the quadratic form is positive definite.

Example 4

Determine whether $q = 2u^2 + 3v^2 - w^2 + 6uv - 8uw - 2vw$ is either positive or negative definite. The discriminant may be written as $\begin{vmatrix} 2 & 3 & -4 \\ 3 & 3 & -1 \\ -4 & -1 & -1 \end{vmatrix}$, and we find its first leading principal minor to be $2 > 0$, but the second leading principal minor is $\begin{vmatrix} 2 & 3 \\ 3 & 3 \end{vmatrix} = -3 < 0$. This violates the condition for both positive and negative definiteness; thus $q$ is neither positive nor negative definite.


n-Variable Quadratic Forms 
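As an extension of the preceding result to the $n$-variable case, we shall state without proof that, for the quadratic form

$$q(u_1, u_2, \ldots, u_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} d_{ij} u_i u_j \quad [\text{where } d_{ij} = d_{ji}] \quad = \underset{(1 \times n)}{u'} \; \underset{(n \times n)}{D} \; \underset{(n \times 1)}{u} \qquad [\text{cf. } (11.12')]$$

the necessary-and-sufficient condition for positive definiteness is that the leading principal minors of $|D|$, namely,

$$|D_1| \equiv d_{11} \qquad |D_2| \equiv \begin{vmatrix} d_{11} & d_{12} \\ d_{21} & d_{22} \end{vmatrix} \qquad \cdots \qquad |D_n| \equiv \begin{vmatrix} d_{11} & d_{12} & \cdots & d_{1n} \\ d_{21} & d_{22} & \cdots & d_{2n} \\ \vdots & \vdots & & \vdots \\ d_{n1} & d_{n2} & \cdots & d_{nn} \end{vmatrix}$$

all be positive. The corresponding necessary-and-sufficient condition for negative definiteness is that the leading principal minors alternate in sign as follows:

$$|D_1| < 0 \qquad |D_2| > 0 \qquad |D_3| < 0 \qquad (\text{etc.})$$

so that all the odd-numbered ones are negative and all even-numbered ones are positive. The $n$th leading principal minor, $|D_n| = |D|$, should be positive if $n$ is even, but negative if $n$ is odd. This can be expressed succinctly by the inequality $(-1)^n |D_n| > 0$.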
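
The determinantal test lends itself to a short mechanical check. The following Python sketch, using only the standard library and hypothetical helper names, computes the leading principal minors by Laplace expansion (adequate for the small matrices in this section, though inefficient for large $n$) and applies the sign rules just stated to the discriminants of Examples 1 through 4.

```python
from fractions import Fraction

def det(m):
    """Determinant by Laplace expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Delete row 0 and column j to form the minor
        minor = [row[:j] + row[j+1:] for row in m[1:]]
        total += (-1)**j * m[0][j] * det(minor)
    return total

def leading_principal_minors(d):
    """|D_1|, ..., |D_n|: keep the first k rows and columns of d."""
    return [det([row[:k] for row in d[:k]]) for k in range(1, len(d) + 1)]

def definiteness(d):
    minors = leading_principal_minors(d)
    if all(m > 0 for m in minors):
        return "positive definite"
    # Alternating signs: (-1)^k |D_k| > 0 for every k
    if all((-1)**k * m > 0 for k, m in enumerate(minors, start=1)):
        return "negative definite"
    return "neither positive nor negative definite"

ex1 = [[5, Fraction(3, 2)], [Fraction(3, 2), 2]]
ex2 = [[-2, 1], [1, -1]]
ex3 = [[1, -1, 0], [-1, 6, -2], [0, -2, 3]]
ex4 = [[2, 3, -4], [3, 3, -1], [-4, -1, -1]]
for d in (ex1, ex2, ex3, ex4):
    print(leading_principal_minors(d), definiteness(d))
```

The third label covers both indefinite and semidefinite forms, since the test as stated here addresses only strict definiteness.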


Characteristic-Root Test for Sign Definiteness 

Aside from the preceding determinantal test for the sign definiteness of a quadratic form $u'Du$, there is an alternative test that utilizes the concept of the so-called characteristic roots† of the matrix $D$. This concept arises in a problem of the following nature. Given an $n \times n$ matrix $D$, can we find a scalar $r$, and an $n \times 1$ vector $x \neq 0$, such that the matrix equation

$$\underset{(n \times n)}{D} \; \underset{(n \times 1)}{x} = r \underset{(n \times 1)}{x} \qquad (11.13)$$

is satisfied? If so, the scalar $r$ is referred to as a characteristic root of matrix $D$ and $x$ as a characteristic vector of that matrix.

The matrix equation $Dx = rx$ can be rewritten as $Dx - rIx = 0$, or

$$(D - rI)x = 0 \qquad \text{where } 0 \text{ is } n \times 1 \qquad (11.13')$$

† Characteristic roots are also known by the alternative names of latent roots, or eigenvalues. Characteristic vectors are also called eigenvectors.

This, of course, represents a system of $n$ homogeneous linear equations. Since we want a nontrivial solution for $x$, the coefficient matrix $(D - rI)$, called the characteristic matrix of $D$, is required to be singular. In other words, its determinant must be made to vanish:

$$|D - rI| = \begin{vmatrix} d_{11} - r & d_{12} & \cdots & d_{1n} \\ d_{21} & d_{22} - r & \cdots & d_{2n} \\ \vdots & \vdots & & \vdots \\ d_{n1} & d_{n2} & \cdots & d_{nn} - r \end{vmatrix} = 0 \qquad (11.14)$$

Equation (11.14) is called the characteristic equation of matrix $D$. Since the determinant $|D - rI|$ will yield, upon Laplace expansion, an $n$th-degree polynomial in the variable $r$, (11.14) is in fact an $n$th-degree polynomial equation. There will thus be a total of $n$ roots, $(r_1, \ldots, r_n)$, each of which qualifies as a characteristic root. If $D$ is symmetric, as is the case in the quadratic-form context, the characteristic roots will always turn out to be real numbers, but they can take either algebraic sign, or be zero.

Inasmuch as these values of $r$ will all make the determinant $|D - rI|$ vanish, the substitution of any of these (say, $r_i$) into the equation system (11.13') will produce a corresponding vector $x|_{r = r_i}$. More accurately, the system being homogeneous, it will yield an infinite number of vectors corresponding to the root $r_i$. We shall, however, apply a process of normalization (to be explained in Example 5) and select a particular member of that infinite set as the characteristic vector corresponding to $r_i$; this vector will be denoted by $v_i$. With a total of $n$ characteristic roots, there should be a total of $n$ such corresponding characteristic vectors.


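Example 5

Find the characteristic roots and vectors of the matrix $\begin{bmatrix} 2 & 2 \\ 2 & -1 \end{bmatrix}$. By substituting the given matrix for $D$ in (11.14), we get the equation

$$\begin{vmatrix} 2 - r & 2 \\ 2 & -1 - r \end{vmatrix} = r^2 - r - 6 = 0$$

with roots $r_1 = 3$ and $r_2 = -2$. When the first root is used, the matrix equation (11.13') takes the form of

$$\begin{bmatrix} 2 - 3 & 2 \\ 2 & -1 - 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -1 & 2 \\ 2 & -4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

The two rows of the coefficient matrix being linearly dependent, as we would expect in view of (11.14), there is an infinite number of solutions, which can be expressed by the equation $x_1 = 2x_2$. To force out a unique solution, we normalize the solution by imposing the restriction $x_1^2 + x_2^2 = 1$.† Then, since

$$x_1^2 + x_2^2 = (2x_2)^2 + x_2^2 = 5x_2^2 = 1$$

we can obtain (by taking the positive square root) $x_2 = 1/\sqrt{5}$, and also $x_1 = 2x_2 = 2/\sqrt{5}$. Thus the first characteristic vector is

$$v_1 = \begin{bmatrix} 2/\sqrt{5} \\ 1/\sqrt{5} \end{bmatrix}$$

† More generally, for the $n$-variable case, we require that $\sum_{i=1}^{n} x_i^2 = 1$.

Similarly, by using the second root $r_2 = -2$ in (11.13'), we get the equation

$$\begin{bmatrix} 2 - (-2) & 2 \\ 2 & -1 - (-2) \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 4 & 2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

which has the solution $x_1 = -\tfrac{1}{2}x_2$. Upon normalization, we find

$$x_1^2 + x_2^2 = \left(-\tfrac{1}{2}x_2\right)^2 + x_2^2 = \tfrac{5}{4}x_2^2 = 1$$

which yields $x_2 = 2/\sqrt{5}$ and $x_1 = -1/\sqrt{5}$. Thus the second characteristic vector is

$$v_2 = \begin{bmatrix} -1/\sqrt{5} \\ 2/\sqrt{5} \end{bmatrix}$$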
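
For a symmetric $2 \times 2$ matrix the characteristic equation is a quadratic, so the roots and normalized vectors of Example 5 can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names are made up, the vector routine assumes a nonzero off-diagonal element, and the sign convention (positive second component) is chosen to match the example.

```python
import math

def char_roots_2x2(d11, d12, d22):
    """Roots of r^2 - (d11 + d22) r + (d11 d22 - d12^2) = 0, larger first."""
    trace = d11 + d22
    determinant = d11*d22 - d12**2
    # The discriminant equals (d11 - d22)^2 + 4 d12^2, hence is never negative
    s = math.sqrt(trace**2 - 4*determinant)
    return (trace + s) / 2, (trace - s) / 2

def char_vector_2x2(d11, d12, d22, r):
    """Normalized solution of (D - rI)x = 0, assuming d12 != 0."""
    # First row: (d11 - r) x1 + d12 x2 = 0, so x is proportional to (d12, r - d11)
    x1, x2 = d12, r - d11
    norm = math.hypot(x1, x2)
    x1, x2 = x1 / norm, x2 / norm
    if x2 < 0:            # pick the member of the solution set with x2 > 0
        x1, x2 = -x1, -x2
    return x1, x2

r1, r2 = char_roots_2x2(2, 2, -1)
v1 = char_vector_2x2(2, 2, -1, r1)
v2 = char_vector_2x2(2, 2, -1, r2)
print(r1, r2, v1, v2)
```

For larger matrices one would normally rely on a numerical linear-algebra library rather than solving the characteristic polynomial by hand.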


The set of characteristic vectors obtained in this manner possesses two important properties: First, the scalar product $v_i'v_i$ $(i = 1, 2, \ldots, n)$ must be equal to unity, since

$$v_i'v_i = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \sum_{i=1}^{n} x_i^2 = 1 \qquad [\text{by normalization}]$$

Second, the scalar product $v_i'v_j$ (where $i \neq j$) can always be taken to be zero.† In sum, therefore, we may write that

$$v_i'v_i = 1 \qquad \text{and} \qquad v_i'v_j = 0 \quad (i \neq j) \qquad (11.15)$$

These properties will prove useful later (see Example 6). As a matter of terminology, when two vectors yield a zero-valued scalar product, the vectors are said to be orthogonal (perpendicular) to each other.‡ Hence each pair of characteristic vectors of matrix $D$ must be orthogonal. The other property, $v_i'v_i = 1$, is indicative of normalization. Together, these two properties account for the fact that the characteristic vectors $(v_1, \ldots, v_n)$ are said to be a set of orthonormal vectors. You should try to verify the orthonormality of the two characteristic vectors found in Example 5.

† To demonstrate this, we note that, by (11.13), we may write $Dv_j = r_jv_j$ and $Dv_i = r_iv_i$. By premultiplying both sides of each of these equations by an appropriate row vector, we have

$$\begin{aligned}
v_i'Dv_j &= v_i'r_jv_j = r_jv_i'v_j \qquad &&[r_j \text{ is a scalar}] \\
v_j'Dv_i &= v_j'r_iv_i = r_iv_j'v_i = r_iv_i'v_j \qquad &&[v_j'v_i = v_i'v_j]
\end{aligned}$$

Since $v_i'Dv_j$ and $v_j'Dv_i$ are both $1 \times 1$, and since they are transposes of each other (recall that $D' = D$ because $D$ is symmetric), they must represent the same scalar. It follows that the extreme-right expressions in these two equations are equal; hence, by subtracting, we have

$$(r_j - r_i)\,v_i'v_j = 0$$

Now if $r_j \neq r_i$ (distinct roots), then $v_i'v_j$ has to be zero in order for the equation to hold, and this establishes our claim. If $r_j = r_i$ (repeated roots), moreover, it will always be possible, as it turns out, to find two linearly independent normalized vectors satisfying $v_i'v_j = 0$. Thus, we may state in general that $v_i'v_j = 0$, whenever $i \neq j$.

‡ As a simple illustration of this, think of the two unit vectors of a 2-space, $e_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $e_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$. These vectors lie, respectively, on the two axes, and are thus perpendicular. At the same time, we do find that $e_1'e_2 = e_2'e_1 = 0$. See also Exercise 4.3-4.

Now we are ready to explain how the characteristic roots and characteristic vectors of matrix $D$ can be of service in determining the sign definiteness of the quadratic form $u'Du$. In essence, the idea is again to transform $u'Du$ (which involves not only squared terms $u_1^2, \ldots, u_n^2$, but also cross-product terms such as $u_1u_2$ and $u_2u_3$) into a form that contains only squared terms. Thus the approach is similar in intent to the completing-the-square process used before in deriving the determinantal test. However, in the present case, the transformation possesses the additional feature that each squared term has as its coefficient one of the characteristic roots, so that the signs of the $n$ roots will provide sufficient information for determining the sign definiteness of the quadratic form.

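The transformation that will do the trick is as follows. Let the characteristic vectors
$v_1, \ldots, v_n$ constitute the columns of a matrix T:

$$\underset{(n \times n)}{T} = [\,v_1 \;\; v_2 \;\; \cdots \;\; v_n\,]$$

and then apply the transformation $\underset{(n \times 1)}{u} = \underset{(n \times n)}{T}\;\underset{(n \times 1)}{y}$ to the quadratic form $u'Du$:

$$\begin{aligned}
u'Du &= (Ty)'D(Ty) = y'T'DTy \qquad [\text{by } 4.10] \\
     &= y'Ry \qquad \text{where } R \equiv T'DT
\end{aligned}$$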


As a result, the original quadratic form in the variables $u_i$ is now turned into another
quadratic form in the variables $y_i$. Since the $u_i$ variables and the $y_i$ variables take the same
range of values, the transformation does not affect the sign definiteness of the quadratic
form. Thus we may now just as well consider the sign of the quadratic form $y'Ry$ instead.
What makes this latter quadratic form intriguing is that the matrix R will turn out to be a
diagonal one, with the roots $r_1, \ldots, r_n$ of matrix D displayed along its diagonal, and with
zeros everywhere else, so that we have in fact

$$u'Du = y'Ry = [\,y_1 \;\; y_2 \;\; \cdots \;\; y_n\,]
\begin{bmatrix} r_1 & 0 & \cdots & 0 \\ 0 & r_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & r_n \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
= r_1 y_1^2 + r_2 y_2^2 + \cdots + r_n y_n^2 \qquad (11.16)$$

which is an expression involving squared terms only. The transformation $R \equiv T'DT$
provides us, therefore, with a procedure for diagonalizing the symmetric matrix D into the
special diagonal matrix R.


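Verify that the matrix $\begin{bmatrix} 2 & 2 \\ 2 & -1 \end{bmatrix}$ given in Example 5 can be diagonalized into the matrix
$\begin{bmatrix} r_1 & 0 \\ 0 & r_2 \end{bmatrix} = \begin{bmatrix} 3 & 0 \\ 0 & -2 \end{bmatrix}$. On the basis of the characteristic vectors found in Example 5, the
transformation matrix T should be

$$T = [\,v_1 \;\; v_2\,] = \begin{bmatrix} 2/\sqrt{5} & -1/\sqrt{5} \\ 1/\sqrt{5} & 2/\sqrt{5} \end{bmatrix}$$

Thus we may write

$$R = T'DT =
\begin{bmatrix} 2/\sqrt{5} & 1/\sqrt{5} \\ -1/\sqrt{5} & 2/\sqrt{5} \end{bmatrix}
\begin{bmatrix} 2 & 2 \\ 2 & -1 \end{bmatrix}
\begin{bmatrix} 2/\sqrt{5} & -1/\sqrt{5} \\ 1/\sqrt{5} & 2/\sqrt{5} \end{bmatrix}
= \begin{bmatrix} 3 & 0 \\ 0 & -2 \end{bmatrix}$$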
“Vs 5 JS 5 


which duly verifies the diagonalization process. 
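
The multiplication can also be checked numerically. The following is only a minimal sketch in plain Python, assuming that the Example 5 matrix is D = [2 2; 2 -1] with normalized characteristic vectors (2/√5, 1/√5) and (-1/√5, 2/√5); the helper names `transpose` and `matmul` are illustrative and do not come from the text.

```python
import math

def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Multiply two conformable matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

s = math.sqrt(5)
D = [[2.0, 2.0], [2.0, -1.0]]           # assumed Example 5 matrix
T = [[2 / s, -1 / s], [1 / s, 2 / s]]   # columns are v1 and v2

R = matmul(matmul(transpose(T), D), T)  # R = T'DT, expected to be diagonal
TtT = matmul(transpose(T), T)           # orthonormal columns give T'T = I

for row in R:
    print([round(x, 10) for x in row])
```

Up to rounding error, the diagonal of R should show the two characteristic roots, and T'T should be the identity matrix because the columns of T are orthonormal.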


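To prove the diagonalization result in (11.16), let us (partially) write out the matrix R as
follows:

$$R = T'DT = \begin{bmatrix} v_1' \\ v_2' \\ \vdots \\ v_n' \end{bmatrix} D\,[\,v_1 \;\; v_2 \;\; \cdots \;\; v_n\,]$$

We may easily verify that $D[\,v_1 \;\; v_2 \;\; \cdots \;\; v_n\,]$ can be rewritten as $[\,Dv_1 \;\; Dv_2 \;\; \cdots \;\; Dv_n\,]$.
Besides, by (11.13), we can further rewrite this as $[\,r_1v_1 \;\; r_2v_2 \;\; \cdots \;\; r_nv_n\,]$. Hence, we see
that

$$R = \begin{bmatrix} v_1' \\ v_2' \\ \vdots \\ v_n' \end{bmatrix} [\,r_1v_1 \;\; r_2v_2 \;\; \cdots \;\; r_nv_n\,]
= \begin{bmatrix}
r_1 v_1'v_1 & r_2 v_1'v_2 & \cdots & r_n v_1'v_n \\
r_1 v_2'v_1 & r_2 v_2'v_2 & \cdots & r_n v_2'v_n \\
\vdots & \vdots & & \vdots \\
r_1 v_n'v_1 & r_2 v_n'v_2 & \cdots & r_n v_n'v_n
\end{bmatrix}
= \begin{bmatrix}
r_1 & 0 & \cdots & 0 \\
0 & r_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & r_n
\end{bmatrix} \qquad [\text{by (11.15)}]$$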





which is precisely what we intended to show. 
In view of the result in (11.16), we may formally state the characteristic-root test for the
sign definiteness of a quadratic form as follows:

1. $q = u'Du$ is positive (negative) definite, if and only if every characteristic root of D is
positive (negative).

2. $q = u'Du$ is positive (negative) semidefinite, if and only if all characteristic roots of D
are nonnegative (nonpositive).

3. $q = u'Du$ is indefinite, if and only if some of the characteristic roots of D are positive
and some are negative.

Note that, in applying this test, all we need are the characteristic roots; the characteristic
vectors are not required unless we wish to find the transformation matrix T. Note, also, that
this test, unlike the determinantal test previously outlined, permits us to check the second-order
necessary conditions (part 2 of the test) simultaneously with the sufficient conditions
(part 1 of the test). However, it does have a drawback. When the matrix D is of a high
dimension, the polynomial equation (11.14) may not be easily solvable for the characteristic
roots needed for the test. In such cases, the determinantal test might yet be preferable.
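
For a 2 x 2 symmetric matrix the characteristic equation is a quadratic, so the test can be carried out directly. The sketch below is an illustration only: the function names `char_roots_2x2` and `classify` and the tolerance `tol` are assumptions introduced here, not notation from the text.

```python
import math

def char_roots_2x2(d11, d12, d22):
    """Characteristic roots of the symmetric matrix [[d11, d12], [d12, d22]].

    They solve r^2 - (d11 + d22) r + (d11 d22 - d12^2) = 0. The term under
    the square root is (d11 - d22)^2 + 4 d12^2, which is never negative,
    so both roots are real.
    """
    half_trace = (d11 + d22) / 2
    half_gap = math.sqrt((d11 - d22) ** 2 + 4 * d12 ** 2) / 2
    return half_trace + half_gap, half_trace - half_gap

def classify(roots, tol=1e-12):
    """Apply the three parts of the characteristic-root test."""
    has_pos = any(r > tol for r in roots)
    has_neg = any(r < -tol for r in roots)
    has_zero = any(abs(r) <= tol for r in roots)
    if has_pos and has_neg:
        return "indefinite"
    if has_pos:
        return "positive semidefinite" if has_zero else "positive definite"
    if has_neg:
        return "negative semidefinite" if has_zero else "negative definite"
    return "zero form (both positive and negative semidefinite)"

print(char_roots_2x2(2, 2, -1), classify(char_roots_2x2(2, 2, -1)))
print(char_roots_2x2(4, 2, 1), classify(char_roots_2x2(4, 2, 1)))
```

Only the roots are needed for the classification; the characteristic vectors never enter the computation, which matches the remark in the text.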







EXERCISE 11.3 
1. By direct matrix multiplication, express each of the following matrix products as a
quadratic form:


ow af 1] om aff a)fi] 


ow aly a] or olf 14] 


2. In Prob. 1b and c, the coefficient matrices are not symmetric with respect to the principal
diagonal. Verify that by averaging the off-diagonal elements and thus converting
them, respectively, into $\begin{bmatrix} -2 & 2 \\ 2 & -4 \end{bmatrix}$ and $\begin{bmatrix} 5 & 3 \\ 3 & 0 \end{bmatrix}$, we will get the same quadratic forms
as before.

3. On the basis of their coefficient matrices (the symmetric versions), determine by the
determinantal test whether the quadratic forms in Prob. 1a, b, and c are either positive
definite or negative definite.


4. Express each of the following quadratic forms as a matrix product involving a symmetric
coefficient matrix:

(a) $q = 3u^2 - 4uv + 7v^2$
(b) $q = u^2 + 7uv + 3v^2$
(c) $q = 8uv - u^2 - 31v^2$
(d) $q = 6xy - 5y^2 - 2x^2$
(e) $q = 3u_1^2 - 2u_1u_2 + 4u_1u_3 + 5u_2^2 + 4u_3^2 - 2u_2u_3$
(f) $q = -u^2 + 4uv - 6uw - 4v^2 - 7w^2$

5. From the discriminants obtained from the symmetric coefficient matrices of Prob. 4,
ascertain by the determinantal test which of the quadratic forms are positive definite
and which are negative definite.


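6. Find the characteristic roots of each of the following matrices:

(a) $D = \begin{bmatrix} 4 & 2 \\ 2 & 3 \end{bmatrix}$ (b) $E = \begin{bmatrix} -2 & 2 \\ 2 & -4 \end{bmatrix}$ (c) $F = \begin{bmatrix} 5 & 3 \\ 3 & 0 \end{bmatrix}$

What can you conclude about the signs of the quadratic forms $u'Du$, $u'Eu$, and $u'Fu$?
(Check your results against Prob. 3.)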


. Find the characteristic vectors of the matrix [: i . 


8. Given a quadratic form $u'Du$, where D is 2 x 2, the characteristic equation of D can be
written as

$$\begin{vmatrix} d_{11} - r & d_{12} \\ d_{21} & d_{22} - r \end{vmatrix} = 0 \qquad (d_{12} = d_{21})$$

Expand the determinant; express the roots of this equation by use of the quadratic
formula; and deduce the following:
(a) No imaginary number (a number involving $\sqrt{-1}$) can occur in $r_1$ and $r_2$.
(b) To have repeated roots, matrix D must be in the form of $\begin{bmatrix} c & 0 \\ 0 & c \end{bmatrix}$.
(c) To have either positive or negative semidefiniteness, the discriminant of the quadratic
form may vanish, that is, $|D| = 0$ is possible.




11.4 Objective Functions with More than Two Variables 





When there appear in an objective function n > 2 choice variables, it is no longer possible to
graph the function, although we can still speak of a hypersurface in an (n + 1)-dimensional
space. On such a (nongraphable) hypersurface, there again may exist (n + 1)-dimensional
analogs of peaks of domes and bottoms of bowls. How do we identify them?


First-Order Condition for Extremum 
Let us specifically consider a function of three choice variables,

$$z = f(x_1, x_2, x_3)$$

with first partial derivatives $f_1$, $f_2$, and $f_3$ and second partial derivatives $f_{ij}\ (\equiv \partial^2 z/\partial x_i\,\partial x_j)$,
with $i, j = 1, 2, 3$. By virtue of Young's theorem, we have $f_{ij} = f_{ji}$.

Our earlier discussion suggests that, to have a maximum or a minimum of z, it is necessary
that $dz = 0$ for arbitrary values of $dx_1$, $dx_2$, and $dx_3$, not all zero. Since the value of
dz is now

$$dz = f_1\,dx_1 + f_2\,dx_2 + f_3\,dx_3 \qquad (11.17)$$

and since $dx_1$, $dx_2$, and $dx_3$ are arbitrary changes in the independent variables, not all zero,
the only way to guarantee a zero dz is to have $f_1 = f_2 = f_3 = 0$. Thus, again, the necessary
condition for extremum is that all the first-order partial derivatives be zero, the same as for
the two-variable case.*


Second-Order Condition 
The satisfaction of the first-order condition earmarks certain values of z as the stationary
values of the objective function. If at a stationary value of z we find that $d^2z$ is positive
definite, this will suffice to establish that value of z as a minimum. Analogously, the negative
definiteness of $d^2z$ is a sufficient condition for the stationary value to be a maximum. This
raises the questions of how to express $d^2z$ when there are three variables in the function and
how to determine its positive or negative definiteness.

The expression for $d^2z$ can be obtained by differentiating dz in (11.17). In such a
process, as in (11.6), we should treat the derivatives $f_i$ as variables and the differentials $dx_i$


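* As a special case, note that if we happen to be working with a function $z = f(x_1, x_2, x_3)$ implicitly
defined by an equation $F(z, x_1, x_2, x_3) = 0$, where

$$f_i \equiv \frac{\partial z}{\partial x_i} = -\frac{\partial F/\partial x_i}{\partial F/\partial z} \qquad (i = 1, 2, 3)$$

then the first-order condition $f_1 = f_2 = f_3 = 0$ will amount to the condition

$$\frac{\partial F}{\partial x_1} = \frac{\partial F}{\partial x_2} = \frac{\partial F}{\partial x_3} = 0$$

since the value of the denominator $\partial F/\partial z \neq 0$ makes no difference.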




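as constants. Thus we have

$$\begin{aligned}
d^2z \equiv d(dz) &= \frac{\partial(dz)}{\partial x_1}\,dx_1 + \frac{\partial(dz)}{\partial x_2}\,dx_2 + \frac{\partial(dz)}{\partial x_3}\,dx_3 \\
&= \frac{\partial}{\partial x_1}(f_1\,dx_1 + f_2\,dx_2 + f_3\,dx_3)\,dx_1 \\
&\quad + \frac{\partial}{\partial x_2}(f_1\,dx_1 + f_2\,dx_2 + f_3\,dx_3)\,dx_2 \\
&\quad + \frac{\partial}{\partial x_3}(f_1\,dx_1 + f_2\,dx_2 + f_3\,dx_3)\,dx_3 \\
&= f_{11}\,dx_1^2 + f_{12}\,dx_1\,dx_2 + f_{13}\,dx_1\,dx_3 \\
&\quad + f_{21}\,dx_2\,dx_1 + f_{22}\,dx_2^2 + f_{23}\,dx_2\,dx_3 \\
&\quad + f_{31}\,dx_3\,dx_1 + f_{32}\,dx_3\,dx_2 + f_{33}\,dx_3^2 \qquad (11.18)
\end{aligned}$$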


which is a quadratic form similar to (11.12). Consequently, the criteria for positive and negative
definiteness we learned earlier are directly applicable here.

In determining the positive or negative definiteness of $d^2z$, we must again, as we did in
(11.6'), regard $dx_i$ as variables that can take any values (though not all zero), while considering
the derivatives $f_{ij}$ as coefficients upon which to impose certain restrictions. The
coefficients in (11.18) give rise to the symmetric Hessian determinant


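$$|H| = \begin{vmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{vmatrix}$$

whose leading principal minors may be denoted by

$$|H_1| = f_{11} \qquad |H_2| = \begin{vmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{vmatrix} \qquad |H_3| = |H|$$

Thus, on the basis of the determinantal criteria for positive and negative definiteness, we
may state the second-order sufficient condition for an extremum of z as follows:

$$z^* \text{ is a } \begin{Bmatrix} \text{maximum} \\ \text{minimum} \end{Bmatrix} \text{ if }
\begin{cases}
|H_1| < 0;\ |H_2| > 0;\ |H_3| < 0 & (d^2z \text{ negative definite}) \\
|H_1| > 0;\ |H_2| > 0;\ |H_3| > 0 & (d^2z \text{ positive definite})
\end{cases} \qquad (11.19)$$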


In using this condition, we must evaluate all the leading principal minors at the stationary
point where $f_1 = f_2 = f_3 = 0$.
We may, of course, also apply the characteristic-root test and associate the positive definiteness
(negative definiteness) of $d^2z$ with the positivity (negativity) of all the characteristic
roots of the Hessian matrix $\begin{bmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{bmatrix}$. In fact, instead of saying that the second-order
total differential $d^2z$ is positive (negative) definite, it is also acceptable to state that
the Hessian matrix H (to be distinguished from the Hessian determinant |H|) is positive
(negative) definite. In this usage, however, note that the sign definiteness of H refers to the








Example 1 


Example 2 




sign of the quadratic form $d^2z$ with which H is associated, not to the signs of the elements
of H per se.


Find the extreme value(s) of

$$z = 2x_1^2 + x_1x_2 + 4x_2^2 + x_1x_3 + x_3^2 + 2$$

The first-order condition for extremum involves the simultaneous satisfaction of the following
three equations:

$$\begin{aligned}
(f_1 =)\quad 4x_1 + x_2 + x_3 &= 0 \\
(f_2 =)\quad x_1 + 8x_2 &= 0 \\
(f_3 =)\quad x_1 + 2x_3 &= 0
\end{aligned}$$

Because this is a homogeneous linear-equation system, in which all the three equations are
independent (the determinant of the coefficient matrix does not vanish), there exists only the
single solution $x_1^* = x_2^* = x_3^* = 0$. This means that there is only one stationary value, $z^* = 2$.

The Hessian determinant of this function is

$$|H| = \begin{vmatrix} f_{11} & f_{12} & f_{13} \\ f_{21} & f_{22} & f_{23} \\ f_{31} & f_{32} & f_{33} \end{vmatrix} = \begin{vmatrix} 4 & 1 & 1 \\ 1 & 8 & 0 \\ 1 & 0 & 2 \end{vmatrix}$$

whose leading principal minors are all positive:

$$|H_1| = 4 \qquad |H_2| = 31 \qquad |H_3| = 54$$

Thus we can conclude, by (11.19), that $z^* = 2$ is a minimum.
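
The leading principal minors in this example can be recomputed mechanically. The following is a small sketch using cofactor expansion, which is adequate only for small matrices; the names `det`, `leading_principal_minors`, and `determinantal_test` are illustrative assumptions rather than notation from the text.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Delete row 0 and column j to form the minor.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def leading_principal_minors(h):
    """Return [|H1|, |H2|, ..., |Hn|] for a square matrix h."""
    return [det([row[:k] for row in h[:k]]) for k in range(1, len(h) + 1)]

def determinantal_test(h):
    """Second-order sufficient condition, evaluated at a stationary point."""
    minors = leading_principal_minors(h)
    if all(m > 0 for m in minors):
        return "minimum (d2z positive definite)"
    alternating = all((m < 0) if k % 2 == 1 else (m > 0)
                      for k, m in enumerate(minors, start=1))
    if alternating:
        return "maximum (d2z negative definite)"
    return "test inconclusive"

H = [[4, 1, 1],
     [1, 8, 0],
     [1, 0, 2]]
print(leading_principal_minors(H), determinantal_test(H))
```

An "inconclusive" outcome only means that the sufficient condition fails; as the text explains, further information may then be sought from the characteristic roots.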


Find the extreme value(s) of

$$z = -x_1^3 + 3x_1x_3 + 2x_2 - x_2^2 - 3x_3^2$$

The first partial derivatives are found to be

$$f_1 = -3x_1^2 + 3x_3 \qquad f_2 = 2 - 2x_2 \qquad f_3 = 3x_1 - 6x_3$$

By setting all $f_i$ equal to zero, we get three simultaneous equations, one nonlinear and two
linear:

$$\begin{aligned}
-3x_1^2 + 3x_3 &= 0 \\
-2x_2 &= -2 \\
3x_1 - 6x_3 &= 0
\end{aligned}$$

Since the second equation gives $x_2^* = 1$ and the third equation implies $x_1^* = 2x_3^*$, substitution
of these into the first equation yields two solutions:

$$(x_1^*, x_2^*, x_3^*) = \begin{cases} (0, 1, 0), & \text{implying } z^* = 1 \\ (\tfrac{1}{2}, 1, \tfrac{1}{4}), & \text{implying } z^* = \tfrac{17}{16} \end{cases}$$

The second-order partial derivatives, properly arranged, give us the Hessian

$$|H| = \begin{vmatrix} -6x_1 & 0 & 3 \\ 0 & -2 & 0 \\ 3 & 0 & -6 \end{vmatrix}$$



in which the first element ($-6x_1$) reduces to 0 under the first solution (with $x_1^* = 0$) and to
$-3$ under the second (with $x_1^* = \tfrac{1}{2}$). It is immediately obvious that the first solution does
not satisfy the second-order sufficient condition, since $|H_1| = 0$. We may, however, resort to
the characteristic-root test for further information. For this purpose, we apply the characteristic
equation (11.14). Since the quadratic form being tested is $d^2z$, whose discriminant
is the Hessian determinant, we should, of course, substitute the elements of the Hessian for
the $d_{ij}$ elements in that equation. Hence the characteristic equation is (for the first solution)

$$\begin{vmatrix} -r & 0 & 3 \\ 0 & -2 - r & 0 \\ 3 & 0 & -6 - r \end{vmatrix} = 0$$

which, upon expansion, becomes the cubic equation

$$r^3 + 8r^2 + 3r - 18 = 0$$

Using Theorem I in Sec. 3.3, we are able to find an integer root $-2$. Thus the cubic function
should be divisible by $(r + 2)$, and we can factor the cubic function and rewrite the preceding
equation as

$$(r + 2)(r^2 + 6r - 9) = 0$$

It is clear from the $(r + 2)$ term that one of the characteristic roots is $r_1 = -2$. The other two
roots can be found by applying the quadratic formula to the other term; they are
$r_2 = -3 + \tfrac{1}{2}\sqrt{72}$ and $r_3 = -3 - \tfrac{1}{2}\sqrt{72}$. Inasmuch as $r_1$ and $r_3$ are negative but $r_2$ is positive,
the quadratic form $d^2z$ is indefinite, thereby violating the second-order necessary
conditions for both a maximum and a minimum of z. Thus the first solution ($z^* = 1$) is not an
extremum at all.

As for the second solution, the situation is simpler. Since the leading principal minors

$$|H_1| = -3 \qquad |H_2| = 6 \qquad \text{and} \qquad |H_3| = -18$$

duly alternate in sign, the determinantal test is conclusive. According to (11.19), the solution
$z^* = \tfrac{17}{16}$ is a maximum.
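
The arithmetic of this example is easy to double-check. The sketch below uses exact fractions for the stationary points and floating-point numbers for the characteristic roots; the function names `z`, `gradient`, and `cubic` are illustrative assumptions introduced here.

```python
from fractions import Fraction as F
import math

def z(x1, x2, x3):
    """Objective function of the example."""
    return -x1**3 + 3*x1*x3 + 2*x2 - x2**2 - 3*x3**2

def gradient(x1, x2, x3):
    """First partial derivatives (f1, f2, f3)."""
    return (-3*x1**2 + 3*x3, 2 - 2*x2, 3*x1 - 6*x3)

# The two candidate stationary points.
points = [(F(0), F(1), F(0)), (F(1, 2), F(1), F(1, 4))]
values = [z(*p) for p in points]

def cubic(r):
    """Characteristic polynomial of the Hessian at the first solution."""
    return r**3 + 8*r**2 + 3*r - 18

# Roots from the factorization (r + 2)(r^2 + 6r - 9) = 0.
roots = [-2.0, -3 + math.sqrt(72) / 2, -3 - math.sqrt(72) / 2]

for p, v in zip(points, values):
    print(p, gradient(*p), v)
print([round(r, 4) for r in roots])
```

Both points make every first partial derivative vanish, and the three roots have mixed signs, which is what makes $d^2z$ indefinite at the first solution.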


n-Variable Case 
When there are n choice variables, the objective function may be expressed as

$$z = f(x_1, x_2, \ldots, x_n)$$

The total differential will then be

$$dz = f_1\,dx_1 + f_2\,dx_2 + \cdots + f_n\,dx_n$$

so that the necessary condition for extremum (dz = 0 for arbitrary $dx_i$, not all zero) means
that all the n first-order partial derivatives are required to be zero.

The second-order differential $d^2z$ will again be a quadratic form, derivable analogously
to (11.18) and expressible by an n x n array. The coefficients of that array, properly
arranged, will now give the (symmetric) Hessian

$$|H| = \begin{vmatrix}
f_{11} & f_{12} & \cdots & f_{1n} \\
f_{21} & f_{22} & \cdots & f_{2n} \\
\vdots & \vdots & & \vdots \\
f_{n1} & f_{n2} & \cdots & f_{nn}
\end{vmatrix}$$

with leading principal minors $|H_1|, |H_2|, \ldots, |H_n|$, as defined before. The second-order sufficient
condition for extremum is, as before, that all those principal minors be positive (for a
minimum in z) and that they duly alternate in sign (for a maximum in z), the first one being
negative.

TABLE 11.2 Determinantal Test for Relative Extremum: $z = f(x_1, x_2, \ldots, x_n)$

Condition                             Maximum                                                      Minimum
First-order necessary condition       $f_1 = f_2 = \cdots = f_n = 0$                               $f_1 = f_2 = \cdots = f_n = 0$
Second-order sufficient condition†    $|H_1| < 0;\ |H_2| > 0;\ |H_3| < 0;\ \ldots;\ (-1)^n|H_n| > 0$    $|H_1|, |H_2|, \ldots, |H_n| > 0$

† Applicable only after the first-order necessary condition has been satisfied.

In summary, then, if we concentrate on the determinantal test, we have the criteria as
listed in Table 11.2, which is valid for an objective function of any number of choice variables.
As special cases, we can have n = 1 or n = 2. When n = 1, the objective function is
z = f(x), and the conditions for maximization, $f_1 = 0$ and $|H_1| < 0$, reduce to
$f'(x) = 0$ and $f''(x) < 0$, exactly as we learned in Sec. 9.4. Similarly, when n = 2, the
objective function is $z = f(x_1, x_2)$, so that the first-order condition for maximum is
$f_1 = f_2 = 0$, whereas the second-order sufficient condition becomes

$$f_{11} < 0 \qquad \text{and} \qquad \begin{vmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{vmatrix} = f_{11}f_{22} - f_{12}^2 > 0$$

which is merely a restatement of the information presented in Table 11.1.











EXERCISE 11.4 


Find the extreme values, if any, of the following four functions. Check whether they are
maxima or minima by the determinantal test.

1. $z = x_1^2 + 3x_2^2 - 3x_1x_2 + 4x_2x_3 + 6x_3^2$
2. $z = 29 - (x_1^2 + x_2^2 + x_3^2)$
3. $z = x_1x_3 + x_1^2 - x_2 + x_2x_3 + x_2^2 + 3x_3^2$
4. $z = e^{2x} + e^{-y} + e^{w^2} - (2x + 2e^{w} - y)$

Then answer the following questions regarding Hessian matrices and their characteristic
roots.

5. (a) Which of Probs. 1 through 4 yield diagonal Hessian matrices? In each such case, do
the diagonal elements possess a uniform sign?
(b) What can you conclude about the characteristic roots of each diagonal Hessian
matrix found? About the sign definiteness of $d^2z$?
(c) Do the results of the characteristic-root test check with those of the determinantal
test?
6. (a) Find the characteristic roots of the Hessian matrix for Prob. 3.
(b) What can you conclude from your results?
(c) Is your answer to (b) consistent with the result of the determinantal test for Prob. 3?




11.5 Second-Order Conditions 
in Relation to Concavity and Convexity 





Second-order conditions, whether stated in terms of the principal minors of the Hessian
determinant or the characteristic roots of the Hessian matrix, are always concerned with
the question of whether a stationary point is the peak of a hill or the bottom of a valley. In
other words, they relate to how a curve, surface, or hypersurface (as the case may be) bends
itself around a stationary point. In the single-choice-variable case, with z = f(x), the hill
(valley) configuration is manifest in an inverse U-shaped (U-shaped) curve. For the two-variable
function z = f(x, y), the hill (valley) configuration takes the form of a dome-shaped
(bowl-shaped) surface, as illustrated in Fig. 11.2a (Fig. 11.2b). When three or more choice
variables are present, the hills and valleys are no longer graphable, but we may nevertheless
think of "hills" and "valleys" on hypersurfaces.

A function that gives rise to a hill (valley) over the entire domain is said to be a concave
(convex) function.* For the present discussion, we shall take the domain to be the entire $R^n$,
where n is the number of choice variables. Inasmuch as the hill and valley characterizations
refer to the entire domain, concavity and convexity are, of course, global concepts. For a finer
classification, we may also distinguish between concavity and convexity on the one hand, and
strict concavity and strict convexity on the other hand. In the nonstrict case, the hill or valley
is allowed to contain one or more flat (as against curved) portions, such as line segments (on
a curve) or plane segments (on a surface). The presence of the word strict, however, rules out
such line or plane segments. The two surfaces shown in Fig. 11.2 illustrate strictly concave
and strictly convex functions, respectively. The curve in Fig. 6.5, on the other hand, is convex
(it shows a valley) but not strictly convex (it contains line segments). A strictly concave
(strictly convex) function must be concave (convex), but the converse is not true.

In view of the association of concavity and strict concavity with a global hill configura- 
tion, an extremum of a concave function must be a peak—a maximum (as against 
minimum). Moreover, that maximum must be an absolute maximum (as against relative 
maximum), since the hill covers the entire domain. However, that absolute maximum may 
not be unique, because multiple maxima may occur if the hill contains a flat horizontal top. 
The latter possibility can be dismissed only when we specify strict concavity. For only then 
will the peak consist of a single point and the absolute maximum be unique. A unique 
(nonunique) absolute maximum is also referred to as a strong (weak) absolute maximum. 

By analogous reasoning, an extremum of a convex function must be an absolute (or 
global) minimum, which may not be unique. But an extremum of a strictly convex function
must be a unique absolute minimum. 

In the preceding paragraphs, the properties of concavity and convexity are taken to be
global in scope. If they are valid only for a portion of the curve or surface (only on a subset
S of the domain), then the associated maximum and minimum are relative (or local) to
that subset of the domain, since we cannot be certain of the situation outside of subset S. In
our earlier discussion of the sign definiteness of $d^2z$ (or of the Hessian matrix H), we evaluated
the leading principal minors of the Hessian determinant only at the stationary point.
By thus limiting the verification of the hill or valley configuration to a small neighborhood
of the stationary point, we could discuss only relative maxima and minima. But it may


* If the hill (valley) pertains only to a subset S of the domain, the function is said to be concave
(convex) on S.


FIGURE 11.5 




happen that $d^2z$ has a definite sign everywhere, regardless of where the leading principal
minors are evaluated. In that event, the hill or valley would cover the entire domain, and the
maximum or minimum found would be absolute in nature. More specifically, if $d^2z$ is
everywhere negative (positive) semidefinite, the function $z = f(x_1, x_2, \ldots, x_n)$ must be
concave (convex), and if $d^2z$ is everywhere negative (positive) definite, the function f must
be strictly concave (strictly convex).

The preceding discussion is summarized in Fig. 11.5 for a twice continuously differentiable
function $z = f(x_1, x_2, \ldots, x_n)$. For clarity, we concentrate exclusively on concavity
and maximum; however, the relationships depicted will remain valid if the words concave,
negative, and maximum are replaced, respectively, by convex, positive, and minimum. To read
Fig. 11.5, recall that the ⇒ symbol (here elongated and even bent) means "implies." When
that symbol extends from one enclosure (say, a rectangle) to another (say, an oval), it means
that the former implies (is sufficient for) the latter; it also means that the latter is necessary
for the former. And when the ⇒ symbol extends from one enclosure through a second to a
third, it means that the first enclosure, when accompanied by the second, implies the third.

[Fig. 11.5 diagram not reproduced. Enclosure labels that survive: "z* is a stationary point
[first-order condition]"; "$d^2z$ is negative semidefinite at z* [second-order necessary condition]";
"$d^2z$ is negative definite at z* [second-order sufficient condition]"; "f is strictly concave";
"z* is an absolute maximum"; "z* is a unique absolute maximum"; "$d^2z$ is everywhere negative
semidefinite"; "$d^2z$ is everywhere negative definite".]

In this light, the middle column in Fig. 11.5, read from top to bottom, states that the first-order
condition is necessary for z* to be a relative maximum, and the relative-maximum
status of z* is, in turn, necessary for z* to be an absolute maximum, and so on. Alternatively,
reading that column from bottom to top, we see that the fact that z* is a unique
absolute maximum is sufficient to establish z* as an absolute maximum, and the absolute-maximum
status of z* is, in turn, sufficient for z* to be a relative maximum, and so forth.
The three ovals at the top have to do with the first- and second-order conditions at the stationary
point z*. Hence they relate only to a relative maximum. The diamonds and triangles
in the lower part, on the other hand, describe global properties that enable us to draw conclusions
about an absolute maximum. Note that while our earlier discussion indicated only
that the everywhere negative semidefiniteness of $d^2z$ is sufficient for the concavity of function
f, we have added in Fig. 11.5 the information that the condition is necessary, too. In
contrast, the stronger property of everywhere negative definiteness of $d^2z$ is sufficient, but
not necessary, for the strict concavity of f, because strict concavity of f is compatible with
a zero value of $d^2z$ at a stationary point.

The most important message conveyed by Fig. 11.5, however, lies in the two extended ⇒
symbols passing through the two diamonds. The one on the left states that, given a concave
objective function, any stationary point can immediately be identified as an absolute maximum.
Proceeding a step further, we see that the one on the right indicates that if the
objective function is strictly concave, the stationary point must in fact be a unique absolute
maximum. In either case, once the first-order condition is met, concavity or strict concavity
effectively replaces the second-order condition as a sufficient condition for maximum;
nay, for an absolute maximum. The powerfulness of this new sufficient condition becomes
clear when we recall that $d^2z$ can happen to be zero at a peak, causing the second-order
sufficient condition to fail. Concavity or strict concavity, however, can take care of even
such troublesome peaks, because it guarantees that a higher-order sufficient condition is
satisfied even if the second-order one is not. It is for this reason that economists often assume
concavity from the very outset when a maximization model is to be formulated with
a general objective function (and, similarly, convexity is often assumed for a minimization
model). For then all one needs to do is to apply the first-order condition. Note, however, that
if a specific objective function is used, the property of concavity or convexity can no longer
simply be assumed. Rather, it must be checked.





Checking Concavity and Convexity 

Concavity and convexity, strict or nonstrict, can be defined (and checked) in several ways.
We shall first introduce a geometric definition of concavity and convexity for a two-variable
function $z = f(x_1, x_2)$, similar to the one-variable version discussed in Sec. 9.3:


The function $z = f(x_1, x_2)$ is concave (convex) iff, for any pair of distinct points M and N on
its graph (a surface), line segment MN lies either on or below (above) the surface. The function
is strictly concave (strictly convex) iff line segment MN lies entirely below (above) the
surface, except at M and N.


FIGURE 11.6 







The case of a strictly concave function is illustrated in Fig. 11.6, where M and N, two arbitrary
points on the surface, are joined together by a broken line segment as well as a solid
arc, with the latter consisting of points on the surface that lie directly above the line segment.
Since strict concavity requires line segment MN to lie entirely below arc MN (except
at M and N) for any pair of points M and N, the surface must typically be dome-shaped.
Analogously, the surface of a strictly convex function must typically be bowl-shaped. As
for (nonstrictly) concave and convex functions, since line segment MN is allowed to lie on
the surface itself, some portion of the surface, or even the entire surface, may be a plane:
flat, rather than curved.

To facilitate generalization to the nongraphable n-dimensional case, the geometric definition
needs to be translated into an equivalent algebraic version. Returning to Fig. 11.6, let
$u = (u_1, u_2)$ and $v = (v_1, v_2)$ be any two distinct ordered pairs (2-vectors) in the domain
of $z = f(x_1, x_2)$. Then the z values (height of surface) corresponding to these will be
$f(u) = f(u_1, u_2)$ and $f(v) = f(v_1, v_2)$, respectively. We have assumed that the variables
can take all real values, so if u and v are in the domain, then all the points on the line segment
uv are also in the domain. Now each point on the said line segment is in the nature of
a "weighted average" of u and v. Thus we can denote this line segment by $\theta u + (1 - \theta)v$,
where $\theta$ (the Greek letter theta), unlike u and v, is a (variable) scalar with the range of
values $0 \le \theta \le 1$.† By the same token, line segment MN, representing the set of all
weighted averages of f(u) and f(v), can be expressed by $\theta f(u) + (1 - \theta)f(v)$, with $\theta$
again varying from 0 to 1. What about arc MN along the surface? Since that arc shows the





† The weighted-average expression $\theta u + (1 - \theta)v$, for any specific value of $\theta$ between 0 and 1, is
technically known as a convex combination of the two vectors u and v. Leaving a more detailed
explanation of this to a later point of this section, we may note here that when $\theta = 0$, the given
expression reduces to vector v and similarly that when $\theta = 1$, the expression reduces to vector u. An
intermediate value of $\theta$, on the other hand, gives us an average of the two vectors u and v.




values of the function f evaluated at the various points on line segment uv, it can be written
simply as $f[\theta u + (1 - \theta)v]$. Using these expressions, we may now state the following
algebraic definition:

A function f is $\begin{Bmatrix} \text{concave} \\ \text{convex} \end{Bmatrix}$ iff, for any pair of distinct points u and v in the domain of f, and
for $0 < \theta < 1$,

$$\underbrace{\theta f(u) + (1 - \theta) f(v)}_{\text{height of line segment}} \;\begin{Bmatrix} \le \\ \ge \end{Bmatrix}\; \underbrace{f[\theta u + (1 - \theta)v]}_{\text{height of arc}} \qquad (11.20)$$

Note that, in order to exclude the two end points M and N from the height comparison, we
have restricted $\theta$ to the open interval (0, 1) only.

This definition is easily adaptable to strict concavity and convexity by changing the
weak inequalities $\le$ and $\ge$ to the strict inequalities < and >, respectively. The advantage of
the algebraic definition is that it can be applied to a function of any number of variables, for
the vectors u and v in the definition can very well be interpreted as n-vectors instead of
2-vectors.
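
Because (11.20) only involves function values, it can be probed numerically. The sketch below compares the height of the arc with the height of the line segment on a small grid of points; it is an illustration under stated assumptions, since a finite sample can refute concavity or convexity but can never prove it. The names `arc_minus_segment` and `sampled_check`, the grid, and the tolerance `tol` are all hypothetical choices made here.

```python
from itertools import product

def convex_combination(u, v, theta):
    """The point theta*u + (1 - theta)*v on line segment uv."""
    return tuple(theta * a + (1 - theta) * b for a, b in zip(u, v))

def arc_minus_segment(f, u, v, theta):
    """Height of the arc minus height of the line segment in (11.20)."""
    segment = theta * f(u) + (1 - theta) * f(v)
    arc = f(convex_combination(u, v, theta))
    return arc - segment

def sampled_check(f, points, thetas, tol=1e-9):
    """Return (no concavity violation found, no convexity violation found)."""
    gaps = [arc_minus_segment(f, u, v, t)
            for u in points for v in points if u != v
            for t in thetas]
    return all(g >= -tol for g in gaps), all(g <= tol for g in gaps)

points = list(product([-1.0, 0.0, 1.0], repeat=2))  # sample 2-vectors
thetas = [0.25, 0.5, 0.75]                          # inside the open interval (0, 1)

hill = lambda x: -(x[0] ** 2) - x[1] ** 2   # a dome
plane = lambda x: 2 * x[0] + 3 * x[1]       # a linear function
saddle = lambda x: x[0] * x[1]              # neither hill nor valley

for name, f in [("hill", hill), ("plane", plane), ("saddle", saddle)]:
    print(name, sampled_check(f, points, thetas))
```

The same functions accept n-vectors of any length, which mirrors the point that the algebraic definition is not tied to two variables.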

From (11.20), the following three theorems on concavity and convexity can be deduced
fairly easily. These will be stated in terms of functions f(x) and g(x), but x can be interpreted
as a vector of variables; that is, the theorems are valid for functions of any number
of variables.


Theorem I (linear function) If f(x) is a linear function, then it is a concave function as
well as a convex function, but not strictly so.


Theorem II (negative of a function) If f(x) is a concave function, then $-f(x)$ is a
convex function, and vice versa. Similarly, if f(x) is a strictly concave function, then
$-f(x)$ is a strictly convex function, and vice versa.


Theorem III (sum of functions) If f(x) and g(x) are both concave (convex) functions,
then f(x) + g(x) is also a concave (convex) function. If f(x) and g(x) are both concave
(convex) and, in addition, either one or both of them are strictly concave (strictly convex),
then f(x) + g(x) is strictly concave (strictly convex).


Theorem I follows from the fact that a linear function plots as a straight line, plane, or hyperplane,
so that "line segment MN" always coincides with "arc MN." Consequently, the
equality parts of the two weak inequalities in (11.20) are simultaneously satisfied, making the
function qualify as both concave and convex. However, since it fails the strict-inequality
part of the definition, the linear function is neither strictly concave nor strictly convex.

Underlying Theorem II is the fact that the definitions of concavity and convexity differ
only in the sense of inequality. Suppose that f(x) is concave; then

$$\theta f(u) + (1 - \theta) f(v) \le f[\theta u + (1 - \theta)v]$$

Multiplying through by $-1$, and duly reversing the sense of the inequality, we get

$$\theta[-f(u)] + (1 - \theta)[-f(v)] \ge -f[\theta u + (1 - \theta)v]$$

This, however, is precisely the condition for $-f(x)$ to be convex. Thus the theorem
is proved for the concave f(x) case. The geometric interpretation of this result is very


Example 1 




simple: the mirror image of a hill with reference to the base plane or hyperplane is a valley.
The opposite case can be proved similarly.

To see the reason behind Theorem III, suppose that f(x) and g(x) are both concave.
Then the following two inequalities hold:

$$\theta f(u) + (1 - \theta) f(v) \le f[\theta u + (1 - \theta)v] \qquad (11.21)$$
$$\theta g(u) + (1 - \theta) g(v) \le g[\theta u + (1 - \theta)v] \qquad (11.22)$$

Adding these, we obtain a new inequality

$$\theta[f(u) + g(u)] + (1 - \theta)[f(v) + g(v)] \le f[\theta u + (1 - \theta)v] + g[\theta u + (1 - \theta)v] \qquad (11.23)$$

But this is precisely the condition for [f(x) + g(x)] to be concave. Thus the theorem is
proved for the concave case. The proof for the convex case is similar.

Moving to the second part of Theorem III, let f(x) be strictly concave. Then (11.21) becomes
a strict inequality:

$$\theta f(u) + (1 - \theta) f(v) < f[\theta u + (1 - \theta)v] \qquad (11.21')$$

Adding this to (11.22), we find the sum of the left-side expressions in these two inequalities
to be strictly less than the sum of the right-side expressions, regardless of whether
the < sign or the = sign holds in (11.22). This means that (11.23) now becomes a strict
inequality, too, thereby making [f(x) + g(x)] strictly concave. Besides, the same conclusion
emerges a fortiori, if g(x) is made strictly concave along with f(x), that is, if (11.22)
is converted into a strict inequality along with (11.21). This proves the second part of the
theorem for the concave case. The proof for the convex case is similar.

This theorem, which is also valid for a sum of more than two concave (convex) functions,
may prove useful sometimes because it makes possible the compartmentalization
of the task of checking concavity or convexity of a function that consists of additive terms.
If the additive terms are found to be individually concave (convex), that would be sufficient
for the sum function to be concave (convex).


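Check $z = x_1^2 + x_2^2$ for concavity or convexity. To apply (11.20), let $u = (u_1, u_2)$ and $v = (v_1, v_2)$ be any two distinct points in the domain. Then we have

$$f(u) = f(u_1, u_2) = u_1^2 + u_2^2$$

$$f(v) = f(v_1, v_2) = v_1^2 + v_2^2$$

and

$$f[\theta u + (1-\theta)v] = f[\underbrace{\theta u_1 + (1-\theta)v_1}_{\text{value of } x_1},\ \underbrace{\theta u_2 + (1-\theta)v_2}_{\text{value of } x_2}] = [\theta u_1 + (1-\theta)v_1]^2 + [\theta u_2 + (1-\theta)v_2]^2$$

Substituting these into (11.20), subtracting the right-side expression from the left-side one, and collecting terms, we find their difference to be

$$\theta(1-\theta)(u_1^2 + u_2^2) + \theta(1-\theta)(v_1^2 + v_2^2) - 2\theta(1-\theta)(u_1v_1 + u_2v_2) = \theta(1-\theta)[(u_1 - v_1)^2 + (u_2 - v_2)^2]$$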


Since $\theta$ is a positive fraction, $\theta(1-\theta)$ must be positive. Moreover, since $(u_1, u_2)$ and $(v_1, v_2)$ are distinct points, so that either $u_1 \ne v_1$ or $u_2 \ne v_2$ (or both), the bracketed expression must also be positive. Thus the strict $>$ inequality holds in (11.20), and $z = x_1^2 + x_2^2$ is strictly convex.

Alternatively, we may check the $x_1^2$ and $x_2^2$ terms separately. Since each of them is individually strictly convex, their sum is also strictly convex.

Because this function is strictly convex, it possesses a unique absolute minimum. It is easy to verify that the said minimum is $z^* = 0$, attained at $x_1^* = x_2^* = 0$, and that it is indeed absolute and unique because any ordered pair $(x_1, x_2) \ne (0, 0)$ yields a $z$ value greater than zero.

Example 2

Check $z = -x_1^2 - x_2^2$ for concavity or convexity. This function is the negative of the function in Example 1. Thus, by Theorem II, it is strictly concave.

Example 3


Check $z = (x + y)^2$ for concavity or convexity. Even though the variables are denoted by $x$ and $y$ instead of $x_1$ and $x_2$, we can still let $u = (u_1, u_2)$ and $v = (v_1, v_2)$ denote two distinct points in the domain, with the subscript $i$ referring to the $i$th variable. Then we have

$$f(u) = f(u_1, u_2) = (u_1 + u_2)^2$$

$$f(v) = f(v_1, v_2) = (v_1 + v_2)^2$$

and

$$f[\theta u + (1-\theta)v] = [\theta u_1 + (1-\theta)v_1 + \theta u_2 + (1-\theta)v_2]^2 = [\theta(u_1 + u_2) + (1-\theta)(v_1 + v_2)]^2$$

Substituting these into (11.20), subtracting the right-side expression from the left-side one, and simplifying, we find their difference to be

$$\theta(1-\theta)(u_1 + u_2)^2 - 2\theta(1-\theta)(u_1 + u_2)(v_1 + v_2) + \theta(1-\theta)(v_1 + v_2)^2 = \theta(1-\theta)[(u_1 + u_2) - (v_1 + v_2)]^2$$

As in Example 1, $\theta(1-\theta)$ is positive. The square of the bracketed expression is nonnegative (zero cannot be ruled out this time). Thus the $\ge$ inequality holds in (11.20), and the function $(x + y)^2$ is convex, though not strictly so.

Accordingly, this function has an absolute minimum that may not be unique. It is easy to verify that the absolute minimum is $z^* = 0$, attained whenever $x^* + y^* = 0$. That this is an absolute minimum is clear from the fact that whenever $x + y \ne 0$, $z$ will be greater than $z^* = 0$. That it is not unique follows from the fact that an infinite number of $(x^*, y^*)$ pairs can satisfy the condition $x^* + y^* = 0$.
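The definition-based checks in Examples 1 to 3 can also be spot-checked numerically. The following is only an illustrative sketch, not a proof: the helper names (`convexity_gap`, `sample_gaps`) and the sampling ranges are assumptions made here, and a random sample can suggest convexity but never establish it.

```python
import random

def convexity_gap(f, u, v, theta):
    """Left side minus right side of (11.20):
    theta*f(u) + (1 - theta)*f(v) - f(theta*u + (1 - theta)*v)."""
    w = tuple(theta * ui + (1 - theta) * vi for ui, vi in zip(u, v))
    return theta * f(u) + (1 - theta) * f(v) - f(w)

def example1(x):          # z = x1^2 + x2^2
    return x[0] ** 2 + x[1] ** 2

def example2(x):          # z = -x1^2 - x2^2
    return -example1(x)

def example3(x):          # z = (x + y)^2
    return (x[0] + x[1]) ** 2

def sample_gaps(f, trials=1000, seed=0):
    """Evaluate the gap at randomly drawn point pairs (hypothetical ranges)."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(trials):
        u = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        v = (rng.uniform(-5, 5), rng.uniform(-5, 5))
        theta = rng.uniform(0.05, 0.95)
        gaps.append(convexity_gap(f, u, v, theta))
    return gaps

if __name__ == "__main__":
    print(min(sample_gaps(example1)))   # nonnegative gaps are consistent with convexity
    print(max(sample_gaps(example2)))   # nonpositive gaps are consistent with concavity
    # Two distinct points on the line x + y = 0: the gap for Example 3 vanishes,
    # which is why that function is convex but not strictly so.
    print(convexity_gap(example3, (1.0, -1.0), (2.0, -2.0), 0.5))
```

A gap that is positive at every sampled pair is consistent with strict convexity; the zero gap found for Example 3 at two distinct points is what rules strictness out.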


Differentiable Functions 
As stated in (11.20), the definition of concavity and convexity uses no derivatives and thus does not require differentiability. If the function is differentiable, however, concavity and convexity can also be defined in terms of its first derivatives. In the one-variable case, the definition is:

A differentiable function $f(x)$ is $\begin{Bmatrix}\text{concave}\\ \text{convex}\end{Bmatrix}$ iff, for any given point $u$ and any other point $v$ in the domain,

$$f(v) \begin{Bmatrix}\le\\ \ge\end{Bmatrix} f(u) + f'(u)(v - u) \tag{11.24}$$


FIGURE 11.7

Concavity and convexity will be strict, if the weak inequalities in (11.24) are replaced by the strict inequalities $<$ and $>$, respectively. Interpreted geometrically, this definition depicts a concave (convex) curve as one that lies on or below (above) all its tangent lines. To qualify as a strictly concave (strictly convex) curve, on the other hand, the curve must lie strictly below (above) all the tangent lines, except at the points of tangency.

In Fig. 11.7, let point $A$ be any given point on the curve, with height $f(u)$ and with tangent line $AB$. Let $x$ increase from the value $u$. Then a strictly concave curve (as drawn) must, in order to form a hill, curl progressively away from the tangent line $AB$, so that point $C$, with height $f(v)$, has to lie below point $B$. In this case, the slope of line segment $AC$ is less than that of tangent $AB$. If the curve is nonstrictly concave, on the other hand, it may contain a line segment, so that, for instance, arc $AC$ may turn into a line segment and be coincident with line segment $AB$, as a linear portion of the curve. In the latter case the slope of $AC$ is equal to that of $AB$. Together, these two situations imply that

$$(\text{slope of line segment } AC =)\ \frac{DC}{AD} = \frac{f(v) - f(u)}{v - u} \le (\text{slope of } AB =)\ f'(u)$$

When multiplied through by the positive quantity $(v - u)$, this inequality yields the result in (11.24) for the concave function. The same result can be obtained, if we consider instead $x$ values less than $u$.
When there are two or more independent variables, the definition needs a slight 
modification: 


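A differentiable function $f(x) = f(x_1, \ldots, x_n)$ is $\begin{Bmatrix}\text{concave}\\ \text{convex}\end{Bmatrix}$ iff, for any given point $u = (u_1, \ldots, u_n)$ and any other point $v = (v_1, \ldots, v_n)$ in the domain,

$$f(v) \begin{Bmatrix}\le\\ \ge\end{Bmatrix} f(u) + \sum_{j=1}^{n} f_j(u)(v_j - u_j) \tag{11.24'}$$

where $f_j(u) \equiv \partial f/\partial x_j$ is evaluated at $u = (u_1, \ldots, u_n)$.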





This definition requires the graph of a concave (convex) function $f(x)$ to lie on or below (above) all its tangent planes or hyperplanes. For strict concavity and convexity, the weak inequalities in (11.24') should be changed to strict inequalities, which would require the graph of a strictly concave (strictly convex) function to lie strictly below (above) all its tangent planes or hyperplanes, except at the points of tangency.

Finally, consider a function $z = f(x_1, \ldots, x_n)$ which is twice continuously differentiable. For such a function, second-order partial derivatives exist, and thus $d^2z$ is defined. Concavity and convexity can then be checked by the sign of $d^2z$:

A twice continuously differentiable function $z = f(x_1, \ldots, x_n)$ is $\begin{Bmatrix}\text{concave}\\ \text{convex}\end{Bmatrix}$ if, and only if, $d^2z$ is everywhere $\begin{Bmatrix}\text{negative}\\ \text{positive}\end{Bmatrix}$ semidefinite. The said function is strictly $\begin{Bmatrix}\text{concave}\\ \text{convex}\end{Bmatrix}$ if (but not only if) $d^2z$ is everywhere $\begin{Bmatrix}\text{negative}\\ \text{positive}\end{Bmatrix}$ definite. (11.25)

You will recall that the concave and strictly concave aspects of (11.25) have already been incorporated into Fig. 11.5.

Example 4


Check $z = -x^4$ for concavity or convexity by the derivative conditions. We first apply (11.24). The left- and right-side expressions in that inequality are in the present case $-v^4$ and $-u^4 - 4u^3(v - u)$, respectively. Subtracting the latter from the former, we find their difference to be

$$-v^4 + u^4 + 4u^3(v - u) = (v - u)\left(-\frac{v^4 - u^4}{v - u} + 4u^3\right) \qquad [\text{factoring}]$$

$$= (v - u)[-(v^3 + v^2u + vu^2 + u^3) + 4u^3] \qquad [\text{by (7.2)}]$$

It would be nice if the bracketed expression turned out to be divisible by $(v - u)$, for then we could again factor out $(v - u)$ and obtain a squared term $(v - u)^2$ to facilitate the evaluation of sign. As it turns out, this is indeed the case. Thus the preceding expression for the difference can be written as

$$-(v - u)^2[v^2 + 2vu + 3u^2] = -(v - u)^2[(v + u)^2 + 2u^2]$$

Given that $v \ne u$, the sign of this expression must be negative. With the strict $<$ inequality holding in (11.24), the function $z = -x^4$ is strictly concave. This means that it has a unique absolute maximum. As can be easily verified, that maximum is $z^* = 0$, attained at $x^* = 0$.

Because this function is twice continuously differentiable, we may also apply (11.25). Since there is only one variable, (11.25) gives us

$$d^2z = f''(x)\,dx^2 = -12x^2\,dx^2 \qquad [\text{by (11.2)}]$$

We know that $dx^2$ is positive (only nonzero changes in $x$ are being considered); but $-12x^2$ can be either negative or zero. Thus the best we can do is to conclude that $d^2z$ is everywhere negative semidefinite, and that $z = -x^4$ is (nonstrictly) concave. This conclusion from (11.25) is obviously weaker than the one obtained earlier from (11.24); namely, $z = -x^4$ is strictly concave. What limits us to the weaker conclusion in this case is the same culprit that causes the second-derivative test to fail on occasions: the fact that $d^2z$ may take a zero value at a stationary point of a function known to be strictly concave, or strictly convex. This is why, of course, the negative (positive) definiteness of $d^2z$ is presented in (11.25) as only a sufficient, but not necessary, condition for strict concavity (strict convexity).
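The algebra in Example 4 can be cross-checked with exact rational arithmetic. The sketch below is an illustration only; the function names and the grid of test points are assumptions introduced here. It compares the unfactored difference from (11.24) with the factored form, and evaluates $f''(x) = -12x^2$ to show where the second-order test loses its strictness.

```python
from fractions import Fraction

def tangent_gap(u, v):
    """f(v) - [f(u) + f'(u)(v - u)] for f(x) = -x**4,
    i.e. left side minus right side of (11.24)."""
    return -v**4 - (-u**4 - 4 * u**3 * (v - u))

def factored_gap(u, v):
    """Factored form from Example 4: -(v - u)^2 [(v + u)^2 + 2u^2]."""
    return -(v - u) ** 2 * ((v + u) ** 2 + 2 * u**2)

def second_derivative(x):
    """f''(x) = -12x^2 for f(x) = -x**4."""
    return -12 * x**2

# Hypothetical grid of exact test points: -3, -2.5, ..., 3
grid = [Fraction(n, 2) for n in range(-6, 7)]

if __name__ == "__main__":
    same = all(tangent_gap(u, v) == factored_gap(u, v) for u in grid for v in grid)
    negative = all(tangent_gap(u, v) < 0 for u in grid for v in grid if u != v)
    print(same, negative)
    print(second_derivative(Fraction(0)))  # zero at x = 0, so d2z is only semidefinite
```

On this grid the two expressions agree and the gap is negative whenever $v \ne u$, in line with strict concavity, while $f''(0) = 0$ shows why (11.25) alone delivers only the nonstrict conclusion.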


Example 5 


FIGURE 11.8 


Check $z = x_1^2 + x_2^2$ for concavity or convexity by the derivative conditions. This time we have to use (11.24') instead of (11.24). With $u = (u_1, u_2)$ and $v = (v_1, v_2)$ as any two points in the domain, the two sides of (11.24') are

$$\text{Left side} = v_1^2 + v_2^2$$

$$\text{Right side} = u_1^2 + u_2^2 + 2u_1(v_1 - u_1) + 2u_2(v_2 - u_2)$$

Subtracting the latter from the former, and simplifying, we can express their difference as

$$v_1^2 - 2v_1u_1 + u_1^2 + v_2^2 - 2v_2u_2 + u_2^2 = (v_1 - u_1)^2 + (v_2 - u_2)^2$$

Given that $(v_1, v_2) \ne (u_1, u_2)$, this difference is always positive. Thus the strict $>$ inequality holds in (11.24'), and $z = x_1^2 + x_2^2$ is strictly convex. Note that the present result merely reaffirms what we have previously found in Example 1.

As for the use of (11.25), since $f_1 = 2x_1$ and $f_2 = 2x_2$, we have

$$f_{11} = 2 > 0 \quad\text{and}\quad \begin{vmatrix} f_{11} & f_{12}\\ f_{21} & f_{22}\end{vmatrix} = \begin{vmatrix} 2 & 0\\ 0 & 2\end{vmatrix} = 4 > 0$$

regardless of where the second-order partial derivatives are evaluated. Thus $d^2z$ is everywhere positive definite, which duly satisfies the sufficient condition for strict convexity. In the present example, therefore, (11.24') and (11.25) do yield the same conclusion.
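The sign test on the leading principal minors used in Example 5 can be written as a small routine. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the determinant is computed by cofactor expansion (adequate only for the small Hessians met here), and a result of "inconclusive" simply means that the strict test fails, as happens for Example 4 at $x = 0$.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def leading_principal_minors(h):
    """|H1|, |H2|, ..., |Hn| of a square matrix given as a list of rows."""
    return [det([row[:k] for row in h[:k]]) for k in range(1, len(h) + 1)]

def definiteness(h):
    """Classify a symmetric matrix by the signs of its leading principal minors."""
    minors = leading_principal_minors(h)
    if all(d > 0 for d in minors):
        return "positive definite"
    if all((d < 0) if k % 2 == 1 else (d > 0)
           for k, d in enumerate(minors, start=1)):
        return "negative definite"
    return "inconclusive from leading principal minors"

if __name__ == "__main__":
    hessian_example5 = [[2, 0], [0, 2]]
    print(leading_principal_minors(hessian_example5))  # [2, 4]
    print(definiteness(hessian_example5))              # positive definite
    print(definiteness([[0]]))  # f''(0) = 0 for z = -x^4: the strict test fails
```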


Convex Functions versus Convex Sets 

Having clarified the meaning of the adjective convex as applied to a function, we must hasten to explain its meaning when used to describe a set. Although convex sets and convex functions are not unrelated, they are distinct concepts, and it is important not to confuse them.

For easier intuitive grasp, let us begin with the geometric characterization of a convex set. Let $S$ be a set of points in a 2-space or 3-space. If, for any two points in set $S$, the line segment connecting these two points lies entirely in $S$, then $S$ is said to be a convex set. It should be obvious that a straight line satisfies this definition and constitutes a convex set. By convention, a set consisting of a single point is also considered as a convex set, and so is the null set (with no point). For additional examples, let us look at Fig. 11.8. The disk (namely, the "solid" circle, a circle plus all the points within it) is a convex set, because a line joining any two points in the disk lies entirely in the disk, as exemplified by $ab$ (linking two boundary points) and $cd$ (linking two interior points). Note, however, that a (hollow) circle is not in itself a convex set. Similarly, a triangle, or a pentagon, is not in itself a convex set, but its solid version is. The remaining two solid figures in Fig. 11.8 are not convex sets. The palette-shaped figure is reentrant (indented); thus a line segment such as $gh$ does not lie entirely in the set. In the key-shaped figure, moreover, we find not only the feature of reentrance, but also the presence of a hole, which is yet another cause of nonconvexity. Generally speaking, to qualify as a convex set, the set of points must contain no holes, and its boundary must not be indented anywhere.

The geometric definition of convexity also applies readily to point sets in a 3-space. For instance, a solid cube is a convex set, whereas a hollow cylinder is not. When a 4-space or a space of higher dimension is involved, however, the geometric interpretation becomes less obvious. We then need to turn to the algebraic definition of convex sets.

FIGURE 11.9

To this end, it is useful to introduce the concept of convex combination of vectors (points), which is a special type of linear combination. A linear combination of two vectors $u$ and $v$ can be written as

$$k_1 u + k_2 v$$

where $k_1$ and $k_2$ are two scalars. When these two scalars both lie in the closed interval $[0, 1]$ and add up to unity, the linear combination is said to be a convex combination, and can be expressed as

$$\theta u + (1 - \theta)v \qquad (0 \le \theta \le 1) \tag{11.26}$$

As an illustration, the combination $\tfrac{1}{3}u + \tfrac{2}{3}v$ is a convex combination. In view of the fact that these two scalar multipliers are positive fractions adding up to 1, such a convex combination may be interpreted as a weighted average of the two vectors.¹

The unique characteristic of the combination in (11.26) is that, for every acceptable value of $\theta$, the resulting sum vector lies on the line segment connecting the points $u$ and $v$. This can be demonstrated by means of Fig. 11.9, where we have plotted two vectors $u = \begin{bmatrix} u_1\\ u_2\end{bmatrix}$ and $v = \begin{bmatrix} v_1\\ v_2\end{bmatrix}$ as two points with coordinates $(u_1, u_2)$ and $(v_1, v_2)$, respectively.


1 This interpretation has been made use of earlier in the discussion of concave and convex functions. 


If we plot another vector $q$ such that $Oquv$ forms a parallelogram, then we have (by virtue of the discussion in Fig. 4.3)

$$u = q + v \qquad\text{or}\qquad q = u - v$$

It follows that a convex combination of vectors $u$ and $v$ (let us call it $w$) can be expressed in terms of vector $q$, because

$$w = \theta u + (1 - \theta)v = \theta u + v - \theta v = \theta(u - v) + v = \theta q + v$$

Hence, to plot the vector $w$, we can simply add $\theta q$ and $v$ by the familiar parallelogram method. If the scalar $\theta$ is a positive fraction, the vector $\theta q$ will merely be an abridged version of vector $q$; thus $\theta q$ must lie on the line segment $Oq$. Adding $\theta q$ and $v$, therefore, we must find vector $w$ lying on the line segment $uv$, for the new, smaller parallelogram is nothing but the original parallelogram with the $qu$ side shifted downward. The exact location of vector $w$ will, of course, vary according to the value of the scalar $\theta$; by varying $\theta$ from zero to unity, the location of $w$ will shift from $v$ to $u$. Thus the set of all points on the line segment $uv$, including $u$ and $v$ themselves, corresponds to the set of all convex combinations of vectors $u$ and $v$.
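The identity $w = \theta u + (1 - \theta)v = \theta q + v$ can be illustrated with a few lines of code. This is a sketch only: the function names and the two sample vectors are hypothetical choices made here, not values taken from Fig. 11.9.

```python
from fractions import Fraction

def convex_combination(u, v, theta):
    """w = theta*u + (1 - theta)*v for 0 <= theta <= 1, as in (11.26)."""
    if not 0 <= theta <= 1:
        raise ValueError("theta must lie in the closed interval [0, 1]")
    return tuple(theta * a + (1 - theta) * b for a, b in zip(u, v))

def via_q(u, v, theta):
    """The same point computed as theta*q + v, where q = u - v."""
    q = tuple(a - b for a, b in zip(u, v))
    return tuple(theta * qi + b for qi, b in zip(q, v))

def on_line_through(w, u, v):
    """In 2-space, w is collinear with u and v when the cross product
    (w - v) x (u - v) vanishes."""
    return (w[0] - v[0]) * (u[1] - v[1]) - (w[1] - v[1]) * (u[0] - v[0]) == 0

# Hypothetical sample vectors
u, v = (1, 4), (5, 2)

if __name__ == "__main__":
    for theta in (0, Fraction(1, 4), Fraction(1, 2), 1):
        w = convex_combination(u, v, theta)
        print(theta, w, w == via_q(u, v, theta), on_line_through(w, u, v))
```

Moving $\theta$ from 0 to 1 moves $w$ from $v$ to $u$ along the segment, as the parallelogram argument describes.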

In view of the preceding, a convex set may now be redefined as follows: A set $S$ is convex if and only if, for any two points $u \in S$ and $v \in S$, and for every scalar $\theta \in [0, 1]$, it is true that $w = \theta u + (1 - \theta)v \in S$. Because this definition is algebraic, it is applicable regardless of the dimension of the space in which the vectors $u$ and $v$ are located. Comparing this definition of a convex set with that of a convex function in (11.20), we see that even though the same adjective convex is used in both, the meaning of this word changes radically from one context to the other. In describing a function, the word convex specifies how a curve or surface bends itself: it must form a valley. But in describing a set, the word specifies how the points in the set are "packed" together: they must not allow any holes to arise, and the boundary must not be indented. Thus convex functions and convex sets are clearly distinct mathematical entities.

Yet convex functions and convex sets are not unrelated. For one thing, in defining a convex function, we need a convex set for the domain. This is because the definition (11.20) requires that, for any two points $u$ and $v$ in the domain, all the convex combinations of $u$ and $v$ (specifically, $\theta u + (1 - \theta)v$, $0 \le \theta \le 1$) must also be in the domain, which is, of course, just another way of saying that the domain must be a convex set. To satisfy this requirement, we adopted earlier the rather strong assumption that the domain consists of the entire $n$-space (where $n$ is the number of choice variables), which is indeed a convex set. However, with the concept of convex sets at our disposal, we can now substantially weaken that assumption. For all we need to assume is that the domain is a convex subset of $R^n$, rather than $R^n$ itself.

There is yet another way in which convex functions are related to convex sets. If $f(x)$ is a convex function, then for any constant $k$, it can give rise to a convex set

$$S^{\le} \equiv \{x \mid f(x) \le k\} \qquad [f(x) \text{ convex}] \tag{11.27}$$

This is illustrated in Fig. 11.10 for the one-variable case. The set $S^{\le}$ consists of all the $x$ values associated with the segment of the $f(x)$ curve lying on or below the broken horizontal line. Hence it is the line segment on the horizontal axis marked by the heavy dots, which is a convex set. Note that if the $k$ value is changed, the $S^{\le}$ set will become a different line segment on the horizontal axis, but it will still be a convex set.

FIGURE 11.10 [panels (a) and (b)]

Going a step further, we may observe that even a concave function is related to convex sets in similar ways. First, the definition of a concave function in (11.20) is, like the convex-function case, predicated upon a domain that is a convex set. Moreover, even a concave function, say $g(x)$, can generate an associated convex set, given some constant $k$. That convex set is

$$S^{\ge} \equiv \{x \mid g(x) \ge k\} \qquad [g(x) \text{ concave}] \tag{11.28}$$

in which the $\ge$ sign appears instead of $\le$. Geometrically, as shown in Fig. 11.10 for the one-variable case, the set $S^{\ge}$ contains all the $x$ values corresponding to the segment of the $g(x)$ curve lying on or above the broken horizontal line. Thus it is again a line segment on the horizontal axis, a convex set.

Although Fig. 11.10 specifically illustrates the one-variable case, the definitions of $S^{\le}$ and $S^{\ge}$ in (11.27) and (11.28) are not limited to functions of a single variable. They are equally valid if we interpret $x$ to be a vector, i.e., let $x = (x_1, \ldots, x_n)$. In that case, however, (11.27) and (11.28) will define convex sets in the $n$-space instead. It is important to remember that while a convex function implies (11.27), and a concave function implies (11.28), the converse is not true, for (11.27) can also be satisfied by a nonconvex function and (11.28) by a nonconcave function. This is discussed further in Sec. 12.4.
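The one-way nature of (11.27) can be seen on a grid of $x$ values. The sketch below is illustrative only: the three test functions, the grid, and the helper names are assumptions introduced here. In one dimension a convex set is an interval, so on a grid the selected points should form one unbroken run.

```python
def sublevel_points(f, k, grid):
    """Grid points belonging to the set {x | f(x) <= k} of (11.27)."""
    return [x for x in grid if f(x) <= k]

def is_contiguous(points, grid):
    """True when the selected grid points form a single unbroken run,
    the grid analogue of an interval (a convex set on the line)."""
    if not points:
        return True  # the null set is convex by convention
    idx = [grid.index(x) for x in points]
    return idx == list(range(idx[0], idx[-1] + 1))

grid = [n / 10 for n in range(-30, 31)]      # hypothetical grid on [-3, 3]

def convex_f(x):       # convex: x^2
    return x ** 2

def cubic(x):          # not convex, yet its sublevel sets are still intervals
    return x ** 3

def wavy(x):           # not convex, and some sublevel sets split in two
    return x ** 4 - 4 * x ** 2

if __name__ == "__main__":
    print(is_contiguous(sublevel_points(convex_f, 2, grid), grid))
    print(is_contiguous(sublevel_points(cubic, 1, grid), grid))
    print(is_contiguous(sublevel_points(wavy, -1, grid), grid))
```

The cubic shows that a convex sublevel set does not force the function itself to be convex, which is the failure of the converse noted in the text.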





EXERCISE 11.5


1. Use (11.20) to check whether the following functions are concave, convex, strictly concave, strictly convex, or neither:
(a) $z = x^2$ (b) $z = x_1^2 + 2x_2^2$ (c) $z = 2x^2 - xy + y^2$

2. Use (11.24) or (11.24') to check whether the following functions are concave, convex, strictly concave, strictly convex, or neither:
(a) $z = -x^2$ (b) $z = (x_1 + x_2)^2$ (c) $z = -xy$

3. In view of your answer to Prob. 2c, could you have made use of Theorem III of this section to compartmentalize the task of checking the function $z = 2x^2 - xy + y^2$ in Prob. 1c? Explain your answer.

4. Do the following constitute convex sets in the 3-space?
(a) A doughnut (b) A bowling pin (c) A perfect marble
5. The equation $x^2 + y^2 = 4$ represents a circle with center at (0, 0) and with a radius of 2.
(a) Interpret geometrically the set $\{(x, y) \mid x^2 + y^2 \le 4\}$.
(b) Is this set convex?
6. Graph each of the following sets, and indicate whether it is convex:
(a) $\{(x, y) \mid y = e^x\}$ (c) $\{(x, y) \mid y \le 13 - x^2\}$
(b) $\{(x, y) \mid y \ge e^x\}$ (d) $\{(x, y) \mid xy \ge 1;\ x > 0,\ y > 0\}$

7, Given v= i and y= [a , which of the following are convex combinations of u 
and v? 


“f] wf] alt 


8. Given two vectors $u$ and $v$ in the 2-space, find and sketch:
(a) The set of all linear combinations of $u$ and $v$.
(b) The set of all nonnegative linear combinations of $u$ and $v$.
(c) The set of all convex combinations of $u$ and $v$.
9. (a) Rewrite (11.27) and (11.28) specifically for the cases where the $f$ and $g$ functions have $n$ independent variables.
(b) Let $n = 2$, and let the function $f$ be shaped like a (vertically held) ice-cream cone whereas the function $g$ is shaped like a pyramid. Describe the sets $S^{\le}$ and $S^{\ge}$.


11.6 Economic Applications 





At the beginning of this chapter, the case of a multiproduct firm was cited as an illustration of the general problem of optimization with more than one choice variable. We are now equipped to handle that problem and others of a similar nature.


Problem of a Multiproduct Firm

Example 1

Let us first postulate a two-product firm under circumstances of pure competition. Since with pure competition the prices of both commodities must be taken as exogenous, these will be denoted by $P_{10}$ and $P_{20}$, respectively. Accordingly, the firm's revenue function will be

$$R = P_{10}Q_1 + P_{20}Q_2$$

where $Q_i$ represents the output level of the $i$th product per unit of time. The firm's cost function is assumed to be

$$C = 2Q_1^2 + Q_1Q_2 + 2Q_2^2$$

Note that $\partial C/\partial Q_1 = 4Q_1 + Q_2$ (the marginal cost of the first product) is a function not only of $Q_1$ but also of $Q_2$. Similarly, the marginal cost of the second product also depends, in part, on the output level of the first product. Thus, according to the assumed cost function, the two commodities are seen to be technically related in production.

The profit function of this hypothetical firm can now be written readily as

$$\pi = R - C = P_{10}Q_1 + P_{20}Q_2 - 2Q_1^2 - Q_1Q_2 - 2Q_2^2$$

a function of two choice variables ($Q_1$ and $Q_2$) and two price parameters. It is our task to find the levels of $Q_1$ and $Q_2$ which, in combination, will maximize $\pi$. For this purpose, we first find the first-order partial derivatives of the profit function:

$$\begin{aligned}
\pi_1 \left(\equiv \frac{\partial \pi}{\partial Q_1}\right) &= P_{10} - 4Q_1 - Q_2\\
\pi_2 \left(\equiv \frac{\partial \pi}{\partial Q_2}\right) &= P_{20} - Q_1 - 4Q_2
\end{aligned} \tag{11.29}$$

Setting both equal to zero, to satisfy the necessary condition for a maximum, we get the two simultaneous equations

$$\begin{aligned}
4Q_1 + Q_2 &= P_{10}\\
Q_1 + 4Q_2 &= P_{20}
\end{aligned}$$

which yield the unique solution

$$Q_1^* = \frac{4P_{10} - P_{20}}{15} \qquad\text{and}\qquad Q_2^* = \frac{4P_{20} - P_{10}}{15}$$

Thus, if $P_{10} = 12$ and $P_{20} = 18$, for example, we have $Q_1^* = 2$ and $Q_2^* = 4$, implying an optimal profit $\pi^* = 48$ per unit of time.

To be sure that this does represent a maximum profit, let us check the second-order condition. The second partial derivatives, obtainable by partial differentiation of (11.29), give us the following Hessian:

$$|H| = \begin{vmatrix} \pi_{11} & \pi_{12}\\ \pi_{21} & \pi_{22}\end{vmatrix} = \begin{vmatrix} -4 & -1\\ -1 & -4\end{vmatrix}$$

Since $|H_1| = -4 < 0$ and $|H_2| = 15 > 0$, the Hessian matrix (or $d^2z$) is negative definite, and the solution does maximize the profit. In fact, since the signs of the leading principal minors do not depend on where they are evaluated, $d^2z$ is in this case everywhere negative definite. Thus, according to (11.25), the objective function must be strictly concave, and the maximum profit just found is actually a unique absolute maximum.

Example 2


Let us now transplant the problem of Example 1 into the setting of a monopolistic market. By virtue of this new market-structure assumption, the revenue function must be modified to reflect the fact that the prices of the two products will now vary with their output levels (which are assumed to be identical with their sales levels, no inventory accumulation being contemplated in the model). The exact manner in which prices will vary with output levels is, of course, to be found in the demand functions for the firm's two products.

Suppose that the demands facing the monopolist firm are as follows:

$$\begin{aligned}
Q_1 &= 40 - 2P_1 + P_2\\
Q_2 &= 15 + P_1 - P_2
\end{aligned} \tag{11.30}$$

These equations reveal that the two commodities are related in consumption; specifically, they are substitute goods, because an increase in the price of one will raise the demand for the other. As given, (11.30) expresses the quantities demanded $Q_1$ and $Q_2$ as functions of prices, but for our present purposes it will be more convenient to have prices $P_1$ and $P_2$ expressed in terms of the sales volumes $Q_1$ and $Q_2$, that is, to have average-revenue functions for the two products. Since (11.30) can be rewritten as

$$\begin{aligned}
-2P_1 + P_2 &= Q_1 - 40\\
P_1 - P_2 &= Q_2 - 15
\end{aligned}$$

we may (considering $Q_1$ and $Q_2$ as parameters) apply Cramer's rule to solve for $P_1$ and $P_2$ as follows:

$$\begin{aligned}
P_1 &= 55 - Q_1 - Q_2\\
P_2 &= 70 - Q_1 - 2Q_2
\end{aligned} \tag{11.30'}$$

These constitute the desired average-revenue functions, since $P_1 \equiv AR_1$ and $P_2 \equiv AR_2$. Consequently, the firm's total-revenue function can be written as

$$\begin{aligned}
R &= P_1Q_1 + P_2Q_2\\
&= (55 - Q_1 - Q_2)Q_1 + (70 - Q_1 - 2Q_2)Q_2 \qquad [\text{by (11.30')}]\\
&= 55Q_1 + 70Q_2 - 2Q_1Q_2 - Q_1^2 - 2Q_2^2
\end{aligned}$$

If we again assume the total-cost function to be

$$C = Q_1^2 + Q_1Q_2 + Q_2^2$$

then the profit function will be

$$\pi = R - C = 55Q_1 + 70Q_2 - 3Q_1Q_2 - 2Q_1^2 - 3Q_2^2 \tag{11.31}$$

which is an objective function with two choice variables. Once the profit-maximizing output levels $Q_1^*$ and $Q_2^*$ are found, however, the optimal prices $P_1^*$ and $P_2^*$ are easy enough to find from (11.30').

The objective function yields the following first and second partial derivatives:

$$\pi_1 = 55 - 3Q_2 - 4Q_1 \qquad \pi_2 = 70 - 3Q_1 - 6Q_2$$

$$\pi_{11} = -4 \qquad \pi_{12} = \pi_{21} = -3 \qquad \pi_{22} = -6$$

To satisfy the first-order condition for a maximum of $\pi$, we must have $\pi_1 = \pi_2 = 0$; that is,

$$\begin{aligned}
4Q_1 + 3Q_2 &= 55\\
3Q_1 + 6Q_2 &= 70
\end{aligned}$$

Thus the solution output levels (per unit of time) are

$$(Q_1^*, Q_2^*) = \left(8,\ 7\tfrac{2}{3}\right)$$

Upon substitution of this result into (11.30') and (11.31), respectively, we find that

$$P_1^* = 39\tfrac{1}{3} \qquad P_2^* = 46\tfrac{2}{3} \qquad\text{and}\qquad \pi^* = 488\tfrac{1}{3} \quad (\text{per unit of time})$$

Inasmuch as the Hessian is

$$\begin{vmatrix} -4 & -3\\ -3 & -6\end{vmatrix}$$

we have

$$|H_1| = -4 < 0 \qquad\text{and}\qquad |H_2| = 15 > 0$$

so that the value of $\pi^*$ does represent the maximum profit. Here, the signs of the leading principal minors are again independent of where they are evaluated. Thus the Hessian matrix is everywhere negative definite, implying that the objective function is strictly concave and that it has a unique absolute maximum.
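The arithmetic of Examples 1 and 2 can be reproduced with exact fractions. The following is a minimal sketch: the helper `cramer_2x2` and all variable names are assumptions made for illustration, and the script solves only the two first-order systems stated above.

```python
from fractions import Fraction

def cramer_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("coefficient matrix is singular")
    return Fraction(b1 * a22 - a12 * b2, det), Fraction(a11 * b2 - a21 * b1, det)

# Example 1 (pure competition): 4Q1 + Q2 = P10, Q1 + 4Q2 = P20
P10, P20 = 12, 18
q1, q2 = cramer_2x2(4, 1, 1, 4, P10, P20)
profit_competitive = P10 * q1 + P20 * q2 - 2 * q1**2 - q1 * q2 - 2 * q2**2

# Example 2 (monopoly): 4Q1 + 3Q2 = 55, 3Q1 + 6Q2 = 70
m1, m2 = cramer_2x2(4, 3, 3, 6, 55, 70)
p1 = 55 - m1 - m2            # average-revenue functions (11.30')
p2 = 70 - m1 - 2 * m2
profit_monopoly = 55 * m1 + 70 * m2 - 3 * m1 * m2 - 2 * m1**2 - 3 * m2**2   # (11.31)

if __name__ == "__main__":
    print(q1, q2, profit_competitive)     # 2 4 48
    print(m1, m2)                         # 8 23/3
    print(p1, p2, profit_monopoly)        # 118/3 140/3 1465/3
```

The fractions 23/3, 118/3, 140/3 and 1465/3 are the mixed numbers $7\tfrac{2}{3}$, $39\tfrac{1}{3}$, $46\tfrac{2}{3}$ and $488\tfrac{1}{3}$ reported in Example 2.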


Price Discrimination 

Even in a single-product firm, there can arise an optimization problem involving two or more choice variables. Such would be the case, for instance, when a monopolistic firm sells a single product in two or more separate markets (e.g., domestic and foreign) and therefore must decide upon the quantities ($Q_1$, $Q_2$, etc.) to be supplied to the respective markets in order to maximize profit. The several markets will, in general, have different demand conditions, and if demand elasticities differ in the various markets, profit maximization will entail the practice of price discrimination. Let us derive this familiar conclusion mathematically.

Example 3


For a change of pace, this time let us use three choice variables, i.e., assume three separate markets. Also, let us work with general rather than numerical functions. Accordingly, our monopolistic firm will simply be assumed to have total-revenue and total-cost functions as follows:

$$R = R_1(Q_1) + R_2(Q_2) + R_3(Q_3)$$

$$C = C(Q) \qquad\text{where}\qquad Q = Q_1 + Q_2 + Q_3$$

Note that the symbol $R_i$ represents here the revenue function of the $i$th market, rather than a derivative in the sense of $f_i$. Each such revenue function naturally implies a particular demand structure, which will generally be different from those prevailing in the other two markets. On the cost side, on the other hand, only one cost function is postulated, since a single firm is producing for all three markets. In view of the fact that $Q = Q_1 + Q_2 + Q_3$, total cost $C$ is also basically a function of $Q_1$, $Q_2$, and $Q_3$, which constitute the choice variables of the model. We can, of course, rewrite $C(Q)$ as $C(Q_1 + Q_2 + Q_3)$. It should be noted, however, that even though the latter version contains three independent variables, the function should nevertheless be considered as having a single argument only, because the sum of $Q_i$ is really a single entity. In contrast, if the function appears in the form $C(Q_1, Q_2, Q_3)$, then there can be counted as many arguments as independent variables.

Now the profit function is

$$\pi = R_1(Q_1) + R_2(Q_2) + R_3(Q_3) - C(Q)$$

with first partial derivatives $\pi_i \equiv \partial\pi/\partial Q_i$ (for $i = 1, 2, 3$) as follows:*

$$\begin{aligned}
\pi_1 &= R_1'(Q_1) - C'(Q)\frac{\partial Q}{\partial Q_1} = R_1'(Q_1) - C'(Q) \qquad \left[\text{since } \frac{\partial Q}{\partial Q_1} = 1\right]\\
\pi_2 &= R_2'(Q_2) - C'(Q)\frac{\partial Q}{\partial Q_2} = R_2'(Q_2) - C'(Q) \qquad \left[\text{since } \frac{\partial Q}{\partial Q_2} = 1\right]\\
\pi_3 &= R_3'(Q_3) - C'(Q)\frac{\partial Q}{\partial Q_3} = R_3'(Q_3) - C'(Q) \qquad \left[\text{since } \frac{\partial Q}{\partial Q_3} = 1\right]
\end{aligned} \tag{11.32}$$

Setting these equal to zero simultaneously will give us

$$C'(Q) = R_1'(Q_1) = R_2'(Q_2) = R_3'(Q_3)$$

That is,

$$MC = MR_1 = MR_2 = MR_3$$

Thus the levels of $Q_1$, $Q_2$, and $Q_3$ should be chosen such that the marginal revenue in each market is equated to the marginal cost of the total output $Q$.


* Note that, to find $\partial C/\partial Q_i$, the chain rule is used:
$$\frac{\partial C}{\partial Q_i} = \frac{dC}{dQ}\,\frac{\partial Q}{\partial Q_i}$$


To see the implications of this condition with regard to price discrimination, let us first find out how the MR in any market is specifically related to the price in that market. Since the revenue in each market is $R_i = P_iQ_i$, it follows that the marginal revenue must be

$$MR_i \equiv \frac{dR_i}{dQ_i} = P_i\frac{dQ_i}{dQ_i} + Q_i\frac{dP_i}{dQ_i} = P_i\left(1 + \frac{dP_i}{dQ_i}\,\frac{Q_i}{P_i}\right) = P_i\left(1 + \frac{1}{\varepsilon_{di}}\right) \qquad [\text{by (8.4)}]$$

where $\varepsilon_{di}$, the point elasticity of demand in the $i$th market, is normally negative. Consequently, the relationship between $MR_i$ and $P_i$ can be expressed alternatively by the equation

$$MR_i = P_i\left(1 - \frac{1}{|\varepsilon_{di}|}\right) \tag{11.33}$$

Recall that $|\varepsilon_{di}|$ is, in general, a function of $P_i$, so that when $Q_i^*$ is chosen, and $P_i^*$ thus specified, $|\varepsilon_{di}|$ will also assume a specific value, which can be either greater than, or less than, or equal to one. But if $|\varepsilon_{di}| < 1$ (demand being inelastic at a point), then its reciprocal will exceed one, and the parenthesized expression in (11.33) will be negative, thereby implying a negative value for $MR_i$. Similarly, if $|\varepsilon_{di}| = 1$ (unitary elasticity), then $MR_i$ will take a zero value. Inasmuch as a firm's MC is positive, the first-order condition $MC = MR_i$ requires the firm to operate at a positive level of $MR_i$. Hence the firm's chosen sales levels $Q_i$ must be such that the corresponding point elasticity of demand in each market is greater than one.

The first-order condition $MR_1 = MR_2 = MR_3$ can now be translated, via (11.33), into the following:

$$P_1\left(1 - \frac{1}{|\varepsilon_{d1}|}\right) = P_2\left(1 - \frac{1}{|\varepsilon_{d2}|}\right) = P_3\left(1 - \frac{1}{|\varepsilon_{d3}|}\right)$$

From this it can readily be inferred that the smaller the value of $|\varepsilon_d|$ (at the chosen level of output) in a particular market, the higher the price charged in that market must be; hence, price discrimination, if profit is to be maximized.
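Relation (11.33) can be illustrated with a linear average-revenue function $P = a - bQ$. This is a sketch under assumptions: the linear form, the parameter values $a = 63$ and $b = 4$, and the function names are chosen here purely for illustration. For such a demand curve $dQ/dP = -1/b$, so $|\varepsilon_d| = P/(bQ)$, and the elasticity route should reproduce the direct marginal revenue $a - 2bQ$.

```python
from fractions import Fraction

def price(a, b, q):
    """Average revenue P = a - b*Q (hypothetical linear demand)."""
    return a - b * q

def abs_elasticity(a, b, q):
    """|e_d| = |dQ/dP * P/Q| with dQ/dP = -1/b, i.e. P / (b*Q)."""
    return Fraction(price(a, b, q), b * q)

def mr_direct(a, b, q):
    """d(P*Q)/dQ = a - 2*b*Q."""
    return a - 2 * b * q

def mr_from_elasticity(a, b, q):
    """Marginal revenue via (11.33): P * (1 - 1/|e_d|)."""
    return price(a, b, q) * (1 - 1 / abs_elasticity(a, b, q))

if __name__ == "__main__":
    a, b = 63, 4
    for q in (6, 10):
        print(q, abs_elasticity(a, b, q), mr_direct(a, b, q), mr_from_elasticity(a, b, q))
```

At $Q = 6$ demand is elastic and MR is positive; at $Q = 10$ demand is inelastic and MR turns negative, which is why a firm with positive MC would not choose such an output.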

To ensure maximization, let us examine the second-order condition. From (11.32), the second partial derivatives are found to be

$$\begin{aligned}
\pi_{11} &= R_1''(Q_1) - C''(Q)\frac{\partial Q}{\partial Q_1} = R_1''(Q_1) - C''(Q)\\
\pi_{22} &= R_2''(Q_2) - C''(Q)\frac{\partial Q}{\partial Q_2} = R_2''(Q_2) - C''(Q)\\
\pi_{33} &= R_3''(Q_3) - C''(Q)\frac{\partial Q}{\partial Q_3} = R_3''(Q_3) - C''(Q)
\end{aligned}$$

and

$$\pi_{12} = \pi_{21} = \pi_{13} = \pi_{31} = \pi_{23} = \pi_{32} = -C''(Q) \qquad \left[\text{since } \frac{\partial Q}{\partial Q_i} = 1\right]$$

so that we have (after shortening the second-derivative notation)

$$|H| = \begin{vmatrix} R_1'' - C'' & -C'' & -C''\\ -C'' & R_2'' - C'' & -C''\\ -C'' & -C'' & R_3'' - C''\end{vmatrix}$$

The second-order sufficient condition will thus be duly satisfied, provided we have:

1. $|H_1| = R_1'' - C'' < 0$; that is, the slope of $MR_1$ is less than the slope of MC of the entire output [cf. the situation of point $L$ in Fig. 9.6c]. (Since any of the three markets can be taken as the "first" market, this in effect also implies $R_2'' - C'' < 0$ and $R_3'' - C'' < 0$.)

2. $|H_2| = (R_1'' - C'')(R_2'' - C'') - (C'')^2 > 0$; or, $R_1''R_2'' - (R_1'' + R_2'')C'' > 0$.

3. $|H_3| = R_1''R_2''R_3'' - (R_1''R_2'' + R_1''R_3'' + R_2''R_3'')C'' < 0$.

The last two parts of this condition are not as easy to interpret economically as the first.

Note that had we assumed that the general $R_i(Q_i)$ functions are all concave and the general $C(Q)$ function is convex, so that $-C(Q)$ is concave, then the profit function, being the sum of concave functions, could have been taken to be concave, thereby obviating the need to check the second-order condition.

Example 4


To make the above example more concrete, let us now give a numerical version. Suppose 
that our monopolistic firm has the specific average-revenue functions 

$$\begin{aligned}
P_1 &= 63 - 4Q_1 &&\text{so that} & R_1 &= P_1Q_1 = 63Q_1 - 4Q_1^2\\
P_2 &= 105 - 5Q_2 && & R_2 &= P_2Q_2 = 105Q_2 - 5Q_2^2\\
P_3 &= 75 - 6Q_3 && & R_3 &= P_3Q_3 = 75Q_3 - 6Q_3^2
\end{aligned}$$

and that the total-cost function is

$$C = 20 + 15Q$$

Then the marginal functions will be

$$R_1' = 63 - 8Q_1 \qquad R_2' = 105 - 10Q_2 \qquad R_3' = 75 - 12Q_3 \qquad C' = 15$$

When each marginal revenue $R_i'$ is set equal to the marginal cost $C'$ of the total output, the equilibrium quantities are found to be

$$Q_1^* = 6 \qquad Q_2^* = 9 \qquad \text{and} \qquad Q_3^* = 5$$

Thus

$$Q^* = \sum_{i=1}^{3} Q_i^* = 20$$

Substituting these solutions into the revenue and cost equations, we get $\pi^* = 679$ as the total profit from the triple-market business operation.

Because this is a specific model, we do have to check the second-order condition (or the concavity of the objective function). Since the second derivatives are

$$R_1'' = -8 \qquad R_2'' = -10 \qquad R_3'' = -12 \qquad C'' = 0$$

all three parts of the second-order sufficient conditions given in Example 3 are duly satisfied.

It is easy to see from the average-revenue functions that the firm should charge the discriminatory prices $P_1^* = 39$, $P_2^* = 60$, and $P_3^* = 45$ in the three markets. As you can readily verify, the point elasticity of demand is lowest in the second market, in which the highest price is charged.
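The arithmetic of this numerical example can be reproduced with exact fractions. The sketch below is only a check on the figures quoted in the text; the variable names and the dictionary layout are choices made here for illustration. Because $C'' = 0$, the Hessian is diagonal, so its leading principal minors are running products of the $R_i''$ values.

```python
from fractions import Fraction as F

# Linear average-revenue curves P_i = a_i - b_i * Q_i from the example.
markets = {1: (63, 4), 2: (105, 5), 3: (75, 6)}
FIXED_COST, MARGINAL_COST = 20, 15          # C = 20 + 15Q

results = {}
for i, (a, b) in markets.items():
    q = F(a - MARGINAL_COST, 2 * b)         # MR_i = a - 2*b*Q_i set equal to MC
    p = a - b * q                           # price read off the demand curve
    elasticity = F(1, b) * p / q            # |e_di| = |dQ_i/dP_i| * P_i / Q_i
    results[i] = (q, p, elasticity)

total_q = sum(q for q, _, _ in results.values())
revenue = sum(p * q for q, p, _ in results.values())
profit = revenue - (FIXED_COST + MARGINAL_COST * total_q)

# With C'' = 0 the Hessian is diag(R_1'', R_2'', R_3'') = diag(-2*b_i).
r2 = [-2 * b for _, b in markets.values()]
minors = [r2[0], r2[0] * r2[1], r2[0] * r2[1] * r2[2]]

for i, (q, p, e) in results.items():
    print(f"market {i}: Q = {q}, P = {p}, |elasticity| = {float(e):.3f}")
print("total output:", total_q, "profit:", profit, "minors:", minors)
```

The minors alternate in sign starting with a negative value, which is the pattern the second-order sufficient condition asks for.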


Input Decisions of a Firm 
Instead of output levels $Q_i$, the choice variables of a firm may also appear in the guise of input levels.


Example 5 




Consider a competitive firm with the following profit function

$$\pi = R - C = PQ - wL - rK \tag{11.34}$$

where
P = price
Q = output
L = labor
K = capital
w, r = input prices for L and K, respectively

Since the firm operates in a competitive market, the exogenous variables are P, w, and r (written here without the zero subscript). There are three endogenous variables, K, L, and Q. However, output Q is in turn a function of K and L via the production function

$$Q = Q(K, L)$$

We shall assume it to be a Cobb-Douglas function (further discussed in Sec. 12.6) of the form

$$Q = L^{\alpha}K^{\beta}$$

where $\alpha$ and $\beta$ are positive parameters. If we further assume decreasing returns to scale, then $\alpha + \beta < 1$. For simplicity, we shall consider the symmetric case where $\alpha = \beta < \tfrac{1}{2}$:

$$Q = L^{\alpha}K^{\alpha} \tag{11.35}$$

Substituting (11.35) into (11.34) gives us

$$\pi(K, L) = PL^{\alpha}K^{\alpha} - wL - rK$$


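The first-order condition for profit maximization is

$$\begin{aligned}
\frac{\partial \pi}{\partial L} &= P\alpha L^{\alpha-1}K^{\alpha} - w = 0\\
\frac{\partial \pi}{\partial K} &= P\alpha L^{\alpha}K^{\alpha-1} - r = 0
\end{aligned} \tag{11.36}$$

This system of equations defines the optimal L and K for profit maximization. But first, let us check the second-order condition to verify that we do have a maximum.

The Hessian for this problem is

$$|H| = \begin{vmatrix} \pi_{LL} & \pi_{LK}\\ \pi_{KL} & \pi_{KK} \end{vmatrix}
= \begin{vmatrix} P\alpha(\alpha-1)L^{\alpha-2}K^{\alpha} & P\alpha^2L^{\alpha-1}K^{\alpha-1}\\ P\alpha^2L^{\alpha-1}K^{\alpha-1} & P\alpha(\alpha-1)L^{\alpha}K^{\alpha-2} \end{vmatrix}$$

The sufficient condition for a maximum is that $|H_1| < 0$ and $|H| > 0$:

$$\begin{aligned}
|H_1| &= P\alpha(\alpha-1)L^{\alpha-2}K^{\alpha} < 0\\
|H| &= P^2\alpha^2(\alpha-1)^2L^{2\alpha-2}K^{2\alpha-2} - P^2\alpha^4L^{2\alpha-2}K^{2\alpha-2}\\
&= P^2\alpha^2L^{2\alpha-2}K^{2\alpha-2}(1 - 2\alpha) > 0
\end{aligned}$$

Therefore, for $\alpha < \tfrac{1}{2}$, the second-order sufficient condition is satisfied.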
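The sign pattern of the Hessian can be spot-checked numerically. The sketch below evaluates $|H_1|$ and $|H|$ on a small grid of input levels and compares $|H|$ with its factored form. The values $P = 10$ and $\alpha = 0.3$, the grid points, and the function names are assumptions chosen only for illustration.

```python
import itertools

P, ALPHA = 10.0, 0.3   # assumed values; the text requires P > 0 and 0 < ALPHA < 1/2

def hessian_minors(L, K, P=P, a=ALPHA):
    """Return (|H1|, |H|) for pi = P*L**a*K**a - w*L - r*K (w and r drop out)."""
    pi_LL = P * a * (a - 1) * L ** (a - 2) * K ** a
    pi_KK = P * a * (a - 1) * L ** a * K ** (a - 2)
    pi_LK = P * a * a * L ** (a - 1) * K ** (a - 1)
    return pi_LL, pi_LL * pi_KK - pi_LK ** 2

def factored_determinant(L, K, P=P, a=ALPHA):
    """The factored form P^2 a^2 L^(2a-2) K^(2a-2) (1 - 2a) of |H|."""
    return P ** 2 * a ** 2 * L ** (2 * a - 2) * K ** (2 * a - 2) * (1 - 2 * a)

checks = []
for L, K in itertools.product([0.5, 1.0, 4.0, 25.0], repeat=2):
    h1, h = hessian_minors(L, K)
    agrees = abs(h - factored_determinant(L, K)) < 1e-9 * abs(h)
    checks.append(h1 < 0 and h > 0 and agrees)

print("condition holds at all grid points:", all(checks))
print("with alpha = 0.6 instead:", hessian_minors(1.0, 1.0, a=0.6))
```

The last line shows why the restriction on $\alpha$ matters: with an exponent above one-half, $|H_1|$ is still negative but $|H|$ turns negative, so the sufficient condition fails.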


We can now return to the first-order condition to solve for the optimal K and L. Rewriting the first equation in (11.36) to isolate K, we get

$$P\alpha L^{\alpha-1}K^{\alpha} = w$$

$$K = \left(\frac{w}{P\alpha}L^{1-\alpha}\right)^{1/\alpha}$$


Substituting this into the second equation of (11.36), we have

$$P\alpha L^{\alpha}K^{\alpha-1} - r = P\alpha L^{\alpha}\left[\left(\frac{w}{P\alpha}L^{1-\alpha}\right)^{1/\alpha}\right]^{\alpha-1} - r = 0$$

or

$$P^{1/\alpha}\alpha^{1/\alpha}w^{(\alpha-1)/\alpha}L^{(2\alpha-1)/\alpha} - r = 0$$

Rearranging to solve for L then gives us

$$L^* = \left(P\alpha w^{\alpha-1}r^{-\alpha}\right)^{1/(1-2\alpha)}$$

Taking advantage of the symmetry of the model, we can quickly write the optimal K as

$$K^* = \left(P\alpha r^{\alpha-1}w^{-\alpha}\right)^{1/(1-2\alpha)}$$

$L^*$ and $K^*$ are the firm's input demand equations.

If we substitute $L^*$ and $K^*$ into the production function, we find that

$$\begin{aligned}
Q^* &= (L^*K^*)^{\alpha}\\
&= \left(P\alpha w^{\alpha-1}r^{-\alpha}\right)^{\alpha/(1-2\alpha)}\left(P\alpha r^{\alpha-1}w^{-\alpha}\right)^{\alpha/(1-2\alpha)}\\
&= \left(\frac{\alpha^2P^2}{wr}\right)^{\alpha/(1-2\alpha)}
\end{aligned} \tag{11.37}$$

This gives us an expression for the optimal output as a function of the exogenous variables P, w, and r.

Example 6

Let us assume the following circumstances: (1) Two inputs a and b are used in the production of a single product Q of a hypothetical firm. (2) The prices of both inputs, $P_a$ and $P_b$, are beyond the control of the firm, as is the output price P; here we shall denote them by $P_{a0}$, $P_{b0}$, and $P_0$, respectively. (3) The production process takes $t_0$ years ($t_0$ being some positive constant) to complete; thus the revenue from sales should be duly discounted before it can be properly compared with the cost of production incurred at the present time. The rate of discount, on a continuous basis, is assumed to be given at $r_0$.

Upon assumption 1, we can write a general production function $Q = Q(a, b)$, with marginal physical products $Q_a$ and $Q_b$. Assumption 2 enables us to express the total cost as

$$C = P_{a0}a + P_{b0}b$$

and the total revenue as

$$R = P_0Q(a, b)$$

To write the profit function, however, we must first discount the revenue by multiplying it by the constant $e^{-r_0t_0}$, which, to avoid complicated superscripts with subscripts, we shall write as $e^{-rt}$. Thus, the profit function is

$$\pi = P_0Q(a, b)e^{-rt} - P_{a0}a - P_{b0}b$$

in which a and b are the only choice variables.
To maximize profit, it is necessary that the first partial derivatives

$$\begin{aligned}
\pi_a \left(\equiv \frac{\partial \pi}{\partial a}\right) &= P_0Q_ae^{-rt} - P_{a0}\\
\pi_b \left(\equiv \frac{\partial \pi}{\partial b}\right) &= P_0Q_be^{-rt} - P_{b0}
\end{aligned} \tag{11.38}$$

both be zero. This means that

$$P_0Q_ae^{-rt} = P_{a0} \qquad \text{and} \qquad P_0Q_be^{-rt} = P_{b0} \tag{11.39}$$


Since $P_0Q_a$ (the price of the product times the marginal product of input a) represents the value of marginal product of input a ($VMP_a$), the first equation merely says that the present value of $VMP_a$ should be equated to the given price of input a. The second equation is the same prerequisite applied to input b.

Note that, to satisfy (11.39), both marginal physical products $Q_a$ and $Q_b$ must be positive, because $P_0$, $P_{a0}$, $P_{b0}$, and $e^{-rt}$ all have positive values. This has an important interpretation in terms of an isoquant, defined as the locus of input combinations that yield the same output level. When plotted in the ab plane, isoquants will generally appear like those drawn in Fig. 11.11. Inasmuch as each of them pertains to a fixed output level, along any isoquant we must have

$$dQ = Q_a\,da + Q_b\,db = 0$$

which implies that the slope of an isoquant is expressible as

$$\frac{db}{da} = -\frac{Q_a}{Q_b} \qquad \left(= -\frac{MPP_a}{MPP_b}\right) \tag{11.40}$$


Thus, to have both $Q_a$ and $Q_b$ positive is to confine the firm's input choice to the negatively sloped segments of the isoquants only. In Fig. 11.11, the relevant region of operation is accordingly restricted to the shaded area defined by the two so-called ridge lines. Outside the shaded area, where the isoquants are characterized by positive slopes, the marginal product of one input must be negative. The movement from the input combination at M to the one at N, for instance, indicates that with input b held constant the increase in input a leads us to a lower isoquant (a smaller output); thus, $Q_a$ must be negative. Similarly, a movement from M' to N' illustrates the negativity of $Q_b$. Note that when we confine our attention to the shaded area, each isoquant can be taken as a function of the form $b = \phi(a)$, because for every admissible value of a, the isoquant determines a unique value of b.





[Figure 11.11: isoquants plotted in the ab plane, with input b on the vertical axis]


The second-order condition revolves around the second partial derivatives of $\pi$, obtainable from (11.38). Bearing in mind that $Q_a$ and $Q_b$, being derivatives, are themselves functions of the variables a and b, we can find $\pi_{aa}$, $\pi_{ab} = \pi_{ba}$, and $\pi_{bb}$, and arrange them into a Hessian:

$$|H| = \begin{vmatrix} \pi_{aa} & \pi_{ab}\\ \pi_{ab} & \pi_{bb} \end{vmatrix}
= \begin{vmatrix} P_0Q_{aa}e^{-rt} & P_0Q_{ab}e^{-rt}\\ P_0Q_{ab}e^{-rt} & P_0Q_{bb}e^{-rt} \end{vmatrix} \tag{11.41}$$

For a stationary value of $\pi$ to be a maximum, it is sufficient that

$|H_1| < 0$ [that is, $\pi_{aa} < 0$, which can occur iff $Q_{aa} < 0$]

$|H_2| = |H| > 0$ [that is, $\pi_{aa}\pi_{bb} > \pi_{ab}^2$, which can occur iff $Q_{aa}Q_{bb} > Q_{ab}^2$]


Thus, we note, the second-order condition can be tested either with the $\pi_{ij}$ derivatives or the $Q_{ij}$ derivatives, whichever are more convenient.

The symbol $Q_{aa}$ denotes the rate of change of $Q_a$ ($= MPP_a$) as input a changes while input b is fixed; similarly, $Q_{bb}$ denotes the rate of change of $Q_b$ ($= MPP_b$) as input b changes alone. So the second-order sufficient condition stipulates, in part, that the MPP of both inputs be diminishing at the chosen input levels $a^*$ and $b^*$. Observe, however, that diminishing $MPP_a$ and $MPP_b$ does not guarantee the satisfaction of the second-order condition, because the latter condition also involves the magnitude of $Q_{ab} = Q_{ba}$, which measures the rate of change of MPP of one input as the amount of the other input varies.
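A small numerical sketch may make the last point concrete. It assumes a hypothetical production function $Q = a^k b^k$, which is not part of the general model in the text, and compares two exponents at the input combination (1, 1). In both cases the marginal products are diminishing, but only one case passes the full determinantal test.

```python
def second_partials(a, b, k):
    """Q_aa, Q_bb, Q_ab for the assumed production function Q = a**k * b**k."""
    q_aa = k * (k - 1) * a ** (k - 2) * b ** k
    q_bb = k * (k - 1) * a ** k * b ** (k - 2)
    q_ab = k * k * a ** (k - 1) * b ** (k - 1)
    return q_aa, q_bb, q_ab

def report(k, a=1.0, b=1.0):
    """Return (both MPPs diminishing?, second-order sufficient condition met?)."""
    q_aa, q_bb, q_ab = second_partials(a, b, k)
    diminishing = q_aa < 0 and q_bb < 0
    sufficient = q_aa < 0 and q_aa * q_bb > q_ab ** 2
    return diminishing, sufficient

print("k = 0.3:", report(0.3))   # exponents sum to 0.6
print("k = 0.6:", report(0.6))   # exponents sum to 1.2
```

For $k = 0.6$ the cross effect $Q_{ab}$ is large enough that $Q_{aa}Q_{bb} < Q_{ab}^2$, so diminishing marginal products alone do not secure a maximum.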

Upon further examination it emerges that, just as the first-order condition specifies the isoquant to be negatively sloped at the chosen input combination (as shown in the shaded area of Fig. 11.11), the second-order sufficient condition serves to specify that same isoquant to be strictly convex at the chosen input combination. The curvature of the isoquant is associated with the sign of the second derivative $d^2b/da^2$. To obtain the latter, (11.40) must be differentiated totally with respect to a, bearing in mind that both $Q_a$ and $Q_b$ are derivative functions of a and b and yet, on an isoquant, b is itself a function of a; that is,

$$Q_a = Q_a(a, b) \qquad Q_b = Q_b(a, b) \qquad \text{and} \qquad b = \phi(a)$$

The total differentiation thus proceeds as follows:

$$\frac{d^2b}{da^2} = \frac{d}{da}\left(-\frac{Q_a}{Q_b}\right) = -\frac{1}{Q_b^2}\left[Q_b\frac{dQ_a}{da} - Q_a\frac{dQ_b}{da}\right] \tag{11.42}$$

Since b is a function of a on the isoquant, the total-derivative formula (8.9) gives us

$$\begin{aligned}
\frac{dQ_a}{da} &= \frac{\partial Q_a}{\partial b}\frac{db}{da} + \frac{\partial Q_a}{\partial a} = Q_{ba}\frac{db}{da} + Q_{aa}\\
\frac{dQ_b}{da} &= \frac{\partial Q_b}{\partial b}\frac{db}{da} + \frac{\partial Q_b}{\partial a} = Q_{bb}\frac{db}{da} + Q_{ab}
\end{aligned} \tag{11.43}$$

After substituting (11.40) into (11.43) and then substituting the latter into (11.42), we can rewrite the second derivative as

$$\begin{aligned}
\frac{d^2b}{da^2} &= -\frac{1}{Q_b^2}\left[Q_{aa}Q_b - Q_{ba}Q_a - Q_{ab}Q_a + Q_{bb}Q_a^2\left(\frac{1}{Q_b}\right)\right]\\
&= -\frac{1}{Q_b^3}\left[Q_{aa}(Q_b)^2 - 2Q_{ab}(Q_a)(Q_b) + Q_{bb}(Q_a)^2\right]
\end{aligned} \tag{11.44}$$
b 


It is to be noted in (11.44) that the expression in brackets (last line) is a quadratic form in the two variables $Q_a$ and $Q_b$. If the second-order sufficient condition is satisfied, so that

$$Q_{aa} < 0 \qquad \text{and} \qquad \begin{vmatrix} Q_{aa} & -Q_{ab}\\ -Q_{ab} & Q_{bb} \end{vmatrix} > 0$$

then, by virtue of (11.11'), the said quadratic form must be negative definite. This will in turn make $d^2b/da^2$ positive, because $Q_b$ has been constrained to be positive by the first-order condition. Thus the satisfaction of the second-order sufficient condition means that the relevant (negatively sloped) isoquant is strictly convex at the chosen input combination, as was asserted.

The concept of strict convexity, as applied to an isoquant $b = \phi(a)$, which is drawn in the two-dimensional ab plane, should be carefully distinguished from the same concept as applied to the production function $Q(a, b)$ itself, which is drawn in the three-dimensional abQ space. Note, in particular, that if we are to apply the concept of strict concavity or convexity to the production function in the present context, then, to produce the desired isoquant shape, the appropriate stipulation is that $Q(a, b)$ be strictly concave in the 3-space (be dome-shaped), which is in sharp contradistinction to the stipulation that the relevant isoquant be strictly convex in the 2-space (be U-shaped, or shaped like a part of a U).
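Formula (11.44) can be checked against a direct numerical second difference of an isoquant. The sketch below assumes a hypothetical production function $Q = a^k b^k$, for which the isoquant through a given point can be written explicitly; the function names, the point (2, 3), and the step size are illustrative choices.

```python
def curvature_formula(a, b, k):
    """d2b/da2 from (11.44) for the assumed function Q = a**k * b**k."""
    q_a = k * a ** (k - 1) * b ** k
    q_b = k * a ** k * b ** (k - 1)
    q_aa = k * (k - 1) * a ** (k - 2) * b ** k
    q_bb = k * (k - 1) * a ** k * b ** (k - 2)
    q_ab = k * k * a ** (k - 1) * b ** (k - 1)
    bracket = q_aa * q_b ** 2 - 2 * q_ab * q_a * q_b + q_bb * q_a ** 2
    return -bracket / q_b ** 3

def curvature_numeric(a, b, k, h=1e-4):
    """Central second difference of the isoquant b = phi(a) through (a, b)."""
    q_bar = a ** k * b ** k                         # output level held fixed

    def phi(x):
        return (q_bar / x ** k) ** (1 / k)          # solve Q(x, b) = q_bar for b

    return (phi(a + h) - 2 * phi(a) + phi(a - h)) / h ** 2

point = (2.0, 3.0, 0.3)
print(curvature_formula(*point), curvature_numeric(*point))
```

For this assumed function the isoquant is $b = ab_0a_0/a$ in closed form, so both numbers should be close to $2b/a^2 = 1.5$ at the chosen point, a positive value consistent with a strictly convex isoquant.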











Example 7

Next, suppose that interest is compounded quarterly instead, at a given interest rate of $i_0$ per quarter. Also suppose that the production process takes exactly a quarter of a year. The profit function then becomes

$$\pi = P_0Q(a, b)(1 + i_0)^{-1} - P_{a0}a - P_{b0}b$$

The first-order condition is now found to be

$$\begin{aligned}
P_0Q_a(1 + i_0)^{-1} - P_{a0} &= 0\\
P_0Q_b(1 + i_0)^{-1} - P_{b0} &= 0
\end{aligned}$$

with an analytical interpretation entirely the same as in Example 6, except for the different manner of discounting.

You can readily see that the same sufficient condition derived in Example 6 must apply here as well.
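The two manners of discounting can be compared side by side. The sketch below computes the marginal product of input a that the first-order condition requires under each scheme. All numbers (prices of 10 and 2, an 8 percent continuous annual rate, a 2 percent quarterly rate) and all function names are hypothetical.

```python
import math

def continuous_discount(rate, years):
    """Discount factor e^(-rt), as used in Example 6."""
    return math.exp(-rate * years)

def periodic_discount(rate_per_period, periods=1):
    """Discount factor (1 + i)^(-n), as used in Example 7 with n = 1."""
    return (1 + rate_per_period) ** (-periods)

def required_marginal_product(input_price, output_price, discount_factor):
    """Q_a at which P0 * Q_a * (discount factor) equals P_a0."""
    return input_price / (output_price * discount_factor)

P0, PA0 = 10.0, 2.0   # hypothetical output and input prices
mpp_continuous = required_marginal_product(PA0, P0, continuous_discount(0.08, 0.25))
mpp_quarterly = required_marginal_product(PA0, P0, periodic_discount(0.02))
print(mpp_continuous, mpp_quarterly)
```

In either case a heavier discount raises the marginal product needed to justify the same input price, which is the sense in which only the discount factor differs between the two examples.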
EXERCISE 11.6

1. If the competitive firm of Example 1 has the cost function $C = 2Q_1^2 + 2Q_2^2$ instead, then:
(a) Will the production of the two goods still be technically related?
(b) What will be the new optimal levels of $Q_1$ and $Q_2$?
(c) What is the value of $\pi_{12}$? What does this imply economically?

2. A two-product firm faces the following demand and cost functions:
$$Q_1 = 40 - 2P_1 - P_2 \qquad Q_2 = 35 - P_1 - P_2 \qquad C = Q_1^2 + 2Q_2^2 + 10$$
(a) Find the output levels that satisfy the first-order condition for maximum profit. (Use fractions.)
(b) Check the second-order sufficient condition. Can you conclude that this problem possesses a unique absolute maximum?
(c) What is the maximal profit?

3. On the basis of the equilibrium price and quantity in Example 4, calculate the point elasticity of demand $|\varepsilon_{di}|$ (for i = 1, 2, 3). Which market has the highest and the lowest demand elasticities?

4. If the cost function of Example 4 is changed to $C = 20 + 15Q + Q^2$:
(a) Find the new marginal-cost function.
(b) Find the new equilibrium quantities. (Use fractions.)
(c) Find the new equilibrium prices.
(d) Verify that the second-order sufficient condition is met.

5. In Example 7, how would you rewrite the profit function if the following conditions hold?
(a) Interest is compounded semiannually at an interest rate of $i_0$ per annum, and the production process takes 1 year.
(b) Interest is compounded quarterly at an interest rate of $i_0$ per annum, and the production process takes 9 months.

6. Given $Q = Q(a, b)$, how would you express algebraically the isoquant for the output level of, say, 260?


11.7 Comparative-Static Aspects of Optimization

Optimization, which is a special variety of static equilibrium analysis, is naturally also subject to investigations of the comparative-static sort. The idea is, again, to find out how a change in any parameter will affect the equilibrium position of the model, which in the present context refers to the optimal values of the choice variables (and the optimal value of the objective function). Since no new technique is involved beyond those discussed in Part 3, we may proceed directly with some illustrations, based on the examples introduced in Sec. 11.6.





Reduced-Form Solutions 

Example 1 of Sec. 11.6 contains two parameters (or exogenous variables), $P_{10}$ and $P_{20}$; it is not surprising, therefore, that the optimal output levels of this two-product firm are expressed strictly in terms of these parameters:

$$Q_1^* = \frac{4P_{10} - P_{20}}{15} \qquad \text{and} \qquad Q_2^* = \frac{4P_{20} - P_{10}}{15}$$

These are reduced-form solutions, and simple partial differentiation alone is sufficient to tell us all the comparative-static properties of the model, namely,

$$\frac{\partial Q_1^*}{\partial P_{10}} = \frac{4}{15} \qquad \frac{\partial Q_1^*}{\partial P_{20}} = -\frac{1}{15} \qquad \frac{\partial Q_2^*}{\partial P_{10}} = -\frac{1}{15} \qquad \frac{\partial Q_2^*}{\partial P_{20}} = \frac{4}{15}$$

For maximum profit, each product of the firm should be produced in a larger quantity if its market price rises or if the market price of the other product falls.
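Because the reduced forms are linear in the two prices, their comparative-static derivatives can be recovered exactly from unit price changes. The sketch below does this with exact fractions; the starting prices of 12 and 18 and the function names are hypothetical choices for illustration.

```python
from fractions import Fraction as F

def optimal_outputs(p1, p2):
    """Reduced-form solutions Q1*, Q2* in terms of the two prices."""
    return (4 * p1 - p2) / F(15), (4 * p2 - p1) / F(15)

def price_effects(p1, p2, step=F(1)):
    """Change in Q1* and Q2* per unit change in each price, one price at a time."""
    q1, q2 = optimal_outputs(p1, p2)
    q1_a, q2_a = optimal_outputs(p1 + step, p2)   # raise P10 only
    q1_b, q2_b = optimal_outputs(p1, p2 + step)   # raise P20 only
    return {
        "dQ1/dP10": (q1_a - q1) / step, "dQ2/dP10": (q2_a - q2) / step,
        "dQ1/dP20": (q1_b - q1) / step, "dQ2/dP20": (q2_b - q2) / step,
    }

effects = price_effects(F(12), F(18))   # hypothetical prices
for name, value in effects.items():
    print(name, "=", value)
```

The own-price effects come out positive and the cross-price effects negative, matching the signs of the partial derivatives stated above.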

Of course, these conclusions follow only from the particular assumptions of the model in question. We may point out, in particular, that the effects of a change in $P_{10}$ on $Q_2^*$ and of $P_{20}$ on $Q_1^*$ are consequences of the assumed technical relation on the production side of these two commodities, and that in the absence of such a relation we shall have

$$\frac{\partial Q_1^*}{\partial P_{20}} = \frac{\partial Q_2^*}{\partial P_{10}} = 0$$


Moving on to Example 2, we note that the optimal output levels are there stated, numerically, as $Q_1^* = 8$ and $Q_2^* = 7\tfrac{2}{3}$; no parameters appear. In fact, all the constants in the equations of the model are numerical rather than parametric, so that by the time we reach the solution stage those constants have all lost their respective identities through the process of arithmetic manipulation. What this serves to underscore is the fundamental lack of generality in the use of numerical constants and the consequent lack of comparative-static content in the equilibrium solution.

On the other hand, the nonuse of numerical constants is no guarantee that a problem will automatically become amenable to comparative-static analysis. The price-discrimination problem (Example 3), for instance, was primarily set up for the study of the equilibrium (profit-maximization) condition, and no parameter was introduced at all. Accordingly, even though stated in terms of general functions, a reformulation will be necessary if a comparative-static study is contemplated.





General-Function Models 

The input-decision problem of Example 6 illustrates the case where a general-function formulation does embrace several parameters, in fact, no less than five ($P_0$, $P_{a0}$, $P_{b0}$, r, and t), where we have, as before, omitted the 0 subscripts from the exogenous variables $r_0$ and $t_0$. How do we derive the comparative-static properties of this model?

The answer lies again in the application of the implicit-function theorem. But, unlike the cases of nongoal-equilibrium models of the market or of national-income determination, where we worked with the equilibrium conditions of the model, the present context of goal equilibrium dictates that we work with the first-order conditions of optimization. For Example 6, these conditions are stated in (11.39). Collecting all terms in (11.39) to the left of the equals signs, and making explicit that $Q_a$ and $Q_b$ are both functions of the endogenous (choice) variables a and b, we can rewrite the first-order conditions in the format of (8.24) as follows:

$$\begin{aligned}
F^1(a, b; P_0, P_{a0}, P_{b0}, r, t) &= P_0Q_a(a, b)e^{-rt} - P_{a0} = 0\\
F^2(a, b; P_0, P_{a0}, P_{b0}, r, t) &= P_0Q_b(a, b)e^{-rt} - P_{b0} = 0
\end{aligned} \tag{11.45}$$

The functions $F^1$ and $F^2$ are assumed to possess continuous derivatives. Thus it would be possible to apply the implicit-function theorem, provided the Jacobian of this system with respect to the endogenous variables a and b does not vanish at the initial equilibrium. The said Jacobian turns out to be nothing but the Hessian determinant of the $\pi$ function of Example 6:

$$|J| = \begin{vmatrix} \dfrac{\partial F^1}{\partial a} & \dfrac{\partial F^1}{\partial b}\\[2mm] \dfrac{\partial F^2}{\partial a} & \dfrac{\partial F^2}{\partial b} \end{vmatrix}
= \begin{vmatrix} P_0Q_{aa}e^{-rt} & P_0Q_{ab}e^{-rt}\\ P_0Q_{ab}e^{-rt} & P_0Q_{bb}e^{-rt} \end{vmatrix} = |H| \qquad [\text{by (11.41)}] \tag{11.46}$$


Hence, if we assume that the second-order sufficient condition for profit maximization is satisfied, then $|H|$ must be positive, and so must be $|J|$, at the initial equilibrium or optimum. In that event, the implicit-function theorem will enable us to write the pair of implicit functions

$$\begin{aligned}
a^* &= a^*(P_0, P_{a0}, P_{b0}, r, t)\\
b^* &= b^*(P_0, P_{a0}, P_{b0}, r, t)
\end{aligned} \tag{11.47}$$

as well as the pair of identities

$$\begin{aligned}
P_0Q_a(a^*, b^*)e^{-rt} - P_{a0} &\equiv 0\\
P_0Q_b(a^*, b^*)e^{-rt} - P_{b0} &\equiv 0
\end{aligned} \tag{11.48}$$


To study the comparative statics of the model, first take the total differential of each identity in (11.48). For the time being, we shall permit all the exogenous variables to vary, so that the result of total differentiation will involve $da^*$, $db^*$, as well as $dP_0$, $dP_{a0}$, $dP_{b0}$, $dr$, and $dt$. If we place on the left side of the equals sign only those terms involving $da^*$ and $db^*$, the result will be

$$\begin{aligned}
P_0Q_{aa}e^{-rt}\,da^* + P_0Q_{ab}e^{-rt}\,db^* &= -Q_ae^{-rt}\,dP_0 + dP_{a0} + P_0Q_ate^{-rt}\,dr + P_0Q_are^{-rt}\,dt\\
P_0Q_{ab}e^{-rt}\,da^* + P_0Q_{bb}e^{-rt}\,db^* &= -Q_be^{-rt}\,dP_0 + dP_{b0} + P_0Q_bte^{-rt}\,dr + P_0Q_bre^{-rt}\,dt
\end{aligned} \tag{11.49}$$

where, be it noted, the first and second derivatives of Q are all to be evaluated at the equilibrium, i.e., at $a^*$ and $b^*$. You will also note that the coefficients of $da^*$ and $db^*$ on the left are precisely the elements of the Jacobian in (11.46).

To derive the specific comparative-static derivatives, of which there are a total of 10 (why?), we now shall allow only a single exogenous variable to vary at a time. Suppose we let $P_0$ vary, alone. Then $dP_0 \neq 0$, but $dP_{a0} = dP_{b0} = dr = dt = 0$, so that only the first term will remain on the right side of each equation in (11.49). Dividing through by $dP_0$, and interpreting the ratio $da^*/dP_0$ to be the comparative-static derivative $(\partial a^*/\partial P_0)$, and similarly for the ratio $db^*/dP_0$, we can write the matrix equation

$$\begin{bmatrix} P_0Q_{aa}e^{-rt} & P_0Q_{ab}e^{-rt}\\ P_0Q_{ab}e^{-rt} & P_0Q_{bb}e^{-rt} \end{bmatrix}
\begin{bmatrix} (\partial a^*/\partial P_0)\\ (\partial b^*/\partial P_0) \end{bmatrix}
= \begin{bmatrix} -Q_ae^{-rt}\\ -Q_be^{-rt} \end{bmatrix}$$

The solution, by Cramer's rule, is found to be

$$\begin{aligned}
\left(\frac{\partial a^*}{\partial P_0}\right) &= \frac{(Q_bQ_{ab} - Q_aQ_{bb})P_0e^{-2rt}}{|J|}\\
\left(\frac{\partial b^*}{\partial P_0}\right) &= \frac{(Q_aQ_{ab} - Q_bQ_{aa})P_0e^{-2rt}}{|J|}
\end{aligned} \tag{11.50}$$

If you prefer, an alternative method is available for obtaining these results: You may simply differentiate the two identities in (11.48) totally with respect to $P_0$ (while holding the other four exogenous variables fixed), bearing in mind that $P_0$ can affect $a^*$ and $b^*$ via (11.47).


Let us now analyze the signs of the comparative-static derivatives in (11.50). On the assumption that the second-order sufficient condition is satisfied, the Jacobian in the denominator must be positive. The second-order condition also implies that $Q_{aa}$ and $Q_{bb}$ are negative, just as the first-order condition implies that $Q_a$ and $Q_b$ are positive. Moreover, the expression $P_0e^{-2rt}$ is certainly positive. Thus, if $Q_{ab} > 0$ (if increasing one input will raise the MPP of the other input), we can conclude that both $(\partial a^*/\partial P_0)$ and $(\partial b^*/\partial P_0)$ will be positive, implying that an increase in the product price will result in increased employment of both inputs in equilibrium. If $Q_{ab} < 0$, on the other hand, the sign of each derivative in (11.50) will depend on the relative strength of the negative force and the positive force in the parenthetical expression on the right.

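Next, let the exogenous variable r vary, alone. Then all the terms on the right of (11.49) will vanish except those involving $dr$. Dividing through by $dr \neq 0$, we now obtain the following matrix equation

$$\begin{bmatrix} P_0Q_{aa}e^{-rt} & P_0Q_{ab}e^{-rt}\\ P_0Q_{ab}e^{-rt} & P_0Q_{bb}e^{-rt} \end{bmatrix}
\begin{bmatrix} (\partial a^*/\partial r)\\ (\partial b^*/\partial r) \end{bmatrix}
= \begin{bmatrix} P_0Q_ate^{-rt}\\ P_0Q_bte^{-rt} \end{bmatrix}$$

with the solution

$$\begin{aligned}
\left(\frac{\partial a^*}{\partial r}\right) &= \frac{t(Q_aQ_{bb} - Q_bQ_{ab})(P_0e^{-rt})^2}{|J|}\\
\left(\frac{\partial b^*}{\partial r}\right) &= \frac{t(Q_bQ_{aa} - Q_aQ_{ab})(P_0e^{-rt})^2}{|J|}
\end{aligned} \tag{11.51}$$

Both of these comparative-static derivatives will be negative if $Q_{ab}$ is positive, but indeterminate in sign if $Q_{ab}$ is negative.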

By a similar procedure, we may find the effects of changes in the remaining parameters. Actually, in view of the symmetry between r and t in (11.48) it is immediately obvious that both $(\partial a^*/\partial t)$ and $(\partial b^*/\partial t)$ must be similar in appearance to (11.51).

The effects of changes in $P_{a0}$ and $P_{b0}$ are left to you to analyze. As you will find, the sign restriction of the second-order sufficient condition will again be useful in evaluating the comparative-static derivatives, because it can tell us the signs of $Q_{aa}$ and $Q_{bb}$ as well as the Jacobian $|J|$ at the initial equilibrium (optimum). Thus, aside from distinguishing between maximum and minimum, the second-order condition also has a vital role to play in the study of shifts in equilibrium positions.
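The general-function results (11.50) and (11.51) can be sanity-checked on a case where the optimum is known in closed form. The sketch below assumes a hypothetical technology $Q = a^k b^k$ with $k < \tfrac{1}{2}$, for which the optimal inputs follow the input demand equations of Example 5 with the discounted price $P_0e^{-rt}$ in place of P. It then compares the formula values for $\partial a^*/\partial P_0$ and $\partial a^*/\partial r$ with central finite differences. All parameter values and function names are illustrative assumptions.

```python
import math

K_EXP = 0.3  # assumed exponent in Q = a**K_EXP * b**K_EXP (must be below 1/2)

def optimum(P0, Pa, Pb, r, t, k=K_EXP):
    """Closed-form a*, b* for the assumed Cobb-Douglas technology."""
    p = P0 * math.exp(-r * t)                       # discounted output price
    a = (p * k * Pa ** (k - 1) * Pb ** (-k)) ** (1 / (1 - 2 * k))
    b = (p * k * Pb ** (k - 1) * Pa ** (-k)) ** (1 / (1 - 2 * k))
    return a, b

def formula_derivatives(P0, Pa, Pb, r, t, k=K_EXP):
    """da*/dP0 from (11.50) and da*/dr from (11.51), evaluated at the optimum."""
    a, b = optimum(P0, Pa, Pb, r, t, k)
    q_a = k * a ** (k - 1) * b ** k
    q_b = k * a ** k * b ** (k - 1)
    q_aa = k * (k - 1) * a ** (k - 2) * b ** k
    q_bb = k * (k - 1) * a ** k * b ** (k - 2)
    q_ab = k * k * a ** (k - 1) * b ** (k - 1)
    disc = math.exp(-r * t)
    jac = (P0 * disc) ** 2 * (q_aa * q_bb - q_ab ** 2)      # |J| = |H|
    da_dP0 = (q_b * q_ab - q_a * q_bb) * P0 * disc ** 2 / jac
    da_dr = (q_a * q_bb - q_b * q_ab) * t * (P0 * disc) ** 2 / jac
    return da_dP0, da_dr

def numeric_derivatives(P0, Pa, Pb, r, t, h=1e-6):
    """Central finite differences of the closed-form a*."""
    up, down = optimum(P0 + h, Pa, Pb, r, t)[0], optimum(P0 - h, Pa, Pb, r, t)[0]
    da_dP0 = (up - down) / (2 * h)
    up, down = optimum(P0, Pa, Pb, r + h, t)[0], optimum(P0, Pa, Pb, r - h, t)[0]
    da_dr = (up - down) / (2 * h)
    return da_dP0, da_dr

params = dict(P0=10.0, Pa=2.0, Pb=3.0, r=0.05, t=2.0)   # hypothetical values
print("formula:", formula_derivatives(**params))
print("numeric:", numeric_derivatives(**params))
```

Since $Q_{ab} > 0$ for this assumed function, the sketch should show a positive effect of $P_0$ and a negative effect of r on $a^*$, in line with the sign analysis above.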





EXERCISE 11.7

For Probs. 1 through 3, assume that $Q_{ab} > 0$.

1. On the basis of the model described in (11.45) through (11.48), find the comparative-static derivatives $(\partial a^*/\partial P_{a0})$ and $(\partial b^*/\partial P_{a0})$. Interpret the economic meaning of the result. Then analyze the effects on $a^*$ and $b^*$ of a change in $P_{b0}$.

2. For the problem of Example 7 in Sec. 11.6:
(a) How many parameters are there? Enumerate them.
(b) Following the procedure described in (11.45) through (11.50), and assuming that the second-order sufficient condition is satisfied, find the comparative-static derivatives $(\partial a^*/\partial P_0)$ and $(\partial b^*/\partial P_0)$. Evaluate their signs and interpret their economic meanings.
(c) Find $(\partial a^*/\partial i_0)$ and $(\partial b^*/\partial i_0)$, evaluate their signs, and interpret their economic meanings.

3. Show that the results in (11.50) can be obtained alternatively by differentiating the two identities in (11.48) totally with respect to $P_0$, while holding the other exogenous variables fixed. Bear in mind that $P_0$ can affect $a^*$ and $b^*$ by virtue of (11.47).

4. A Jacobian determinant, as defined in (7.27), is made up of first-order partial derivatives. On the other hand, a Hessian determinant, as defined in Secs. 11.3 and 11.4, has as its elements second-order partial derivatives. How, then, can it turn out that $|J| = |H|$, as in (11.46)?

















Chapter 12

Optimization with Equality Constraints


Chapter 11 presented a general method for finding the relative extrema of an objective function of two or more choice variables. One important feature of that discussion is that all the choice variables are independent of one another, in the sense that the decision made regarding one variable does not impinge upon the choices of the remaining variables. For instance, a two-product firm can choose any value for $Q_1$ and any value for $Q_2$ it wishes, without the two choices limiting each other.

If the said firm is somehow required to observe a restriction (such as a production quota) in the form of $Q_1 + Q_2 = 950$, however, the independence between the choice variables will be lost. In that event, the firm's profit-maximizing output levels $Q_1^*$ and $Q_2^*$ will be not only simultaneous but also dependent, because the higher $Q_1^*$ is, the lower $Q_2^*$ must correspondingly be, in order to stay within the combined quota of 950. The new optimum satisfying the production quota constitutes a constrained optimum, which, in general, may be expected to differ from the free optimum discussed in Chap. 11.

A restriction, such as the production quota mentioned before, establishes a relationship between the two variables in their roles as choice variables, but this should be distinguished from other types of relationships that may link the variables together. For instance, in Example 2 of Sec. 11.6, the two products of the firm are related in consumption (substitutes) as well as in production (as is reflected in the cost function), but that fact does not qualify the problem as one of constrained optimization, since the two output variables are still independent as choice variables. Only the dependence of the variables qua choice variables gives rise to a constrained optimum.

In the present chapter, we shall consider equality constraints only, such as $Q_1 + Q_2 = 950$. Our primary concern will be with relative constrained extrema, although absolute ones will also be discussed in Sec. 12.4.


12.1 Effects of a Constraint 





The primary purpose of imposing a constraint is to give due cognizance to certain limiting factors present in the optimization problem under discussion. We have already seen the limitation on output choices that results from a production quota. For further illustration, let us consider a consumer with the simple utility (index) function

$$U = x_1x_2 + 2x_1 \tag{12.1}$$

[Figure 12.1: (a) the budget line $4x_1 + 2x_2 = 60$ in the $x_1x_2$ plane; (b) U plotted along the budget line]


Since the marginal utilities (the partial derivatives $U_1 \equiv \partial U/\partial x_1$ and $U_2 \equiv \partial U/\partial x_2$) are positive for all positive levels of $x_1$ and $x_2$ here, to have U maximized without any constraint, the consumer should purchase an infinite amount of both goods, a solution that obviously has little practical relevance. To render the optimization problem meaningful, the purchasing power of the consumer must also be taken into account; i.e., a budget constraint should be incorporated into the problem. If the consumer intends to spend a given sum, say, $60, on the two goods and if the current prices are $P_{10} = 4$ and $P_{20} = 2$, then the budget constraint can be expressed by the linear equation

$$4x_1 + 2x_2 = 60 \tag{12.2}$$

Such a constraint, like the production quota referred to earlier, renders the choices of $x_1^*$ and $x_2^*$ mutually dependent.

The problem now is to maximize (12.1), subject to the constraint stated in (12.2). Mathematically, what the constraint (variously called restraint, side relation, or subsidiary condition) does is to narrow the domain, and hence the range of the objective function. The domain of (12.1) would normally be the set $\{(x_1, x_2) \mid x_1 \geq 0,\ x_2 \geq 0\}$. Graphically, the domain is represented by the nonnegative quadrant of the $x_1x_2$ plane in Fig. 12.1a. After the budget constraint (12.2) is added, however, we can admit only those values of the variables which satisfy this latter equation, so that the domain is immediately reduced to the set of points lying on the budget line. This will automatically affect the range of the objective function, too; only that subset of the utility surface lying directly above the budget-constraint line will now be relevant. The said subset (a cross section of the surface) may look like the curve in Fig. 12.1b, where U is plotted on the vertical axis, with the budget line of diagram a placed on the horizontal axis. Our interest, then, is only in locating the maximum on the curve in diagram b.

In general, for a function $z = f(x, y)$, the difference between a constrained extremum and a free extremum may be illustrated in the three-dimensional graph of Fig. 12.2. The free extremum in this particular graph is the peak point of the entire dome, but the constrained extremum is at the peak of the inverse U-shaped curve situated on top of (i.e., lying directly above) the constraint line. In general, a constrained maximum can be expected to have a lower value than the free maximum, although, by coincidence, the two maxima may happen to have the same value. But the constrained maximum can never exceed the free maximum.

[Figure 12.2: a dome-shaped surface showing the free maximum, the constraint line, and the constrained maximum lying directly above the constraint]

It is interesting to note that, had we added another constraint intersecting the first constraint at a single point in the xy plane, the two constraints together would have restricted the domain to that single point. Then the locating of the extremum would become a trivial matter. In a meaningful problem, the number and the nature of the constraints should be such as to restrict, but not eliminate, the possibility of choice. Generally, the number of constraints should be less than the number of choice variables.


12.2 Finding the Stationary Values 





Even without any new technique of solution, the constrained maximum in the simple example defined by (12.1) and (12.2) can easily be found. Since the constraint (12.2) implies

$$x_2 = \frac{60 - 4x_1}{2} = 30 - 2x_1 \tag{12.2'}$$

we can combine the constraint with the objective function by substituting (12.2') into (12.1). The result is an objective function in one variable only:

$$U = x_1(30 - 2x_1) + 2x_1 = 32x_1 - 2x_1^2$$

which can be handled with the method already learned. By setting $dU/dx_1 = 32 - 4x_1$ equal to zero, we get the solution $x_1^* = 8$, which by virtue of (12.2') immediately leads to $x_2^* = 30 - 2(8) = 14$. From (12.1), we can then find the stationary value $U^* = 128$; and since the second derivative is $d^2U/dx_1^2 = -4 < 0$, that stationary value constitutes a (constrained) maximum of U.*
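The substitution method for this example is easy to trace in code. The sketch below is only a check on the arithmetic: it eliminates $x_2$ through the budget line, evaluates the stationary point from $32 - 4x_1 = 0$, and compares utility at two nearby points on the budget line. The function names and the step of 1/100 are illustrative choices.

```python
from fractions import Fraction as F

def utility(x1, x2):
    return x1 * x2 + 2 * x1                 # (12.1)

def x2_on_budget(x1):
    return 30 - 2 * x1                      # (12.2'): 4*x1 + 2*x2 = 60 solved for x2

def reduced_utility(x1):
    return utility(x1, x2_on_budget(x1))    # equals 32*x1 - 2*x1**2

x1_star = F(32, 4)                          # root of dU/dx1 = 32 - 4*x1
x2_star = x2_on_budget(x1_star)
u_star = reduced_utility(x1_star)

# Neighbouring points on the budget line should give strictly lower utility.
step = F(1, 100)
is_local_max = all(reduced_utility(x1_star + d) < u_star for d in (-step, step))
print(x1_star, x2_star, u_star, is_local_max)
```

Because the reduced objective is a downward-opening parabola, the local comparison here agrees with the sign of the second derivative reported in the text.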

When the constraint is itself a complicated function, or when there are several constraints to consider, however, the technique of substitution and elimination of variables could become a burdensome task. More importantly, when the constraint comes in a form such that we cannot solve it to express one variable ($x_2$) as an explicit function of the other ($x_1$), the elimination method would in fact be of no avail, even if $x_2$ were known to be an implicit function of $x_1$, that is, even if the conditions of the implicit-function theorem were satisfied. In such cases, we may resort to a method known as the method of Lagrange (undetermined) multiplier, which, as we shall see, has distinct analytical advantages.





Lagrange-Multiplier Method 

The essence of the Lagrange-multiplier method is 10 convert a constrained-cxtremum prob- 
Jem into a form such that the first-order condition of the free-extremum problem can still 
be applied. 

Given the problem of maximizing U = xx + 2x, subject to the constraint 4.7) + 
2x2 = 60 [from (12.1) and (12.2)], let us write what is referred to as the Lagrangian func- 
tion, which is a modified version of the objective function that incorporates the constraint 
as fallows: 


Z =X iXq + 2xy + 4(60 — 4x) — Axo) (12.3) 


The symbol $\lambda$ (the Greek letter lambda), representing some as yet undetermined number, is
called a Lagrange (undetermined) multiplier. If we can somehow be assured that
$4x_1 + 2x_2 = 60$, so that the constraint will be satisfied, then the last term in (12.3) will vanish
regardless of the value of $\lambda$. In that event, $Z$ will be identical with $U$. Moreover, with the
constraint out of the way, we only have to seek the free maximum of $Z$, in lieu of the
constrained maximum of $U$, with respect to the two variables $x_1$ and $x_2$. The question is:
How can we make the parenthetical expression in (12.3) vanish?

The tactic that will accomplish this is simply to treat $\lambda$ as an additional choice variable
in (12.3), i.e., to consider $Z = Z(\lambda, x_1, x_2)$. For then the first-order condition for free
extremum will consist of the set of simultaneous equations


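$$
\begin{aligned}
Z_\lambda\,(\equiv \partial Z/\partial \lambda) &= 60 - 4x_1 - 2x_2 = 0 \\
Z_1\,(\equiv \partial Z/\partial x_1) &= x_2 + 2 - 4\lambda = 0 \\
Z_2\,(\equiv \partial Z/\partial x_2) &= x_1 - 2\lambda = 0
\end{aligned}
\tag{12.4}
$$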





and the first equation will automatically guarantee the satisfaction of the constraint. Thus,
by incorporating the constraint into the Lagrangian function $Z$ and by treating the Lagrange
multiplier as an extra variable, we can obtain the constrained extremum $U^*$ (two choice
variables) simply by screening the stationary values of $Z$, taken as a free function of three
choice variables.

Solving (12.4) for the critical values of the variables, we find $x_1^* = 8$, $x_2^* = 14$ (and
$\lambda^* = 4$). As expected, the values of $x_1^*$ and $x_2^*$ check with the answers already obtained by


* You may recall that for the flower-bed problem of Exercise 9.4-2 the same technique of substitution 
was applied to find the maximum area, using a constraint (the available quantity of wire netting) to 
eliminate one of the two variables (the length or the width of the flower bed). 






the substitution method. Furthermore, it is clear from (12.3) that $Z^* = 128$; this is identical
with the value of $U^*$ found earlier, as it should be.
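
The three first-order equations in (12.4) form a linear system, so they can be solved mechanically. What follows is a small sketch, assuming Cramer's rule on a 3-by-3 system with exact fractions; the helper names `det3` and `cramer3` are hypothetical.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def cramer3(coeffs, rhs):
    """Solve a 3x3 linear system by Cramer's rule; returns exact fractions."""
    base = det3(coeffs)
    solution = []
    for col in range(3):
        # Replace one column of the coefficient matrix by the right-hand side.
        replaced = [row[:col] + [rhs[r]] + row[col + 1:] for r, row in enumerate(coeffs)]
        solution.append(Fraction(det3(replaced), base))
    return solution


# (12.4) rearranged, with the unknowns ordered (lambda, x1, x2):
#   4*x1 + 2*x2 = 60,   -4*lambda + x2 = -2,   -2*lambda + x1 = 0
coeffs = [[0, 4, 2], [-4, 0, 1], [-2, 1, 0]]
rhs = [60, -2, 0]
lam, x1, x2 = cramer3(coeffs, rhs)

# Value of the Lagrangian function (12.3) at the stationary point.
z_star = x1 * x2 + 2 * x1 + lam * (60 - 4 * x1 - 2 * x2)
print(lam, x1, x2, z_star)
```

The last term of `z_star` is zero at the solution, which is why the Lagrangian and the original objective take the same value there.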
In general, given an objective function

$$z = f(x, y) \tag{12.5}$$

subject to the constraint

$$g(x, y) = c \tag{12.6}$$

where $c$ is a constant,† we can write the Lagrangian function as

$$Z = f(x, y) + \lambda[c - g(x, y)] \tag{12.7}$$


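For stationary values of $Z$, regarded as a function of the three variables $\lambda$, $x$, and $y$, the
necessary condition is

$$
\begin{aligned}
Z_\lambda &= c - g(x, y) = 0 \\
Z_x &= f_x - \lambda g_x = 0 \\
Z_y &= f_y - \lambda g_y = 0
\end{aligned}
\tag{12.8}
$$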





Since the first equation in (12.8) is simply a restatement of (12.6), the stationary values of
the Lagrangian function $Z$ will automatically satisfy the constraint of the original function
$z$. And since the expression $\lambda[c - g(x, y)]$ is now assuredly zero, the stationary values of $Z$
in (12.7) must be identical with those of (12.5), subject to (12.6).

Let us illustrate the method with two more examples. 


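Example 1

Find the extremum of

$$z = xy \quad \text{subject to} \quad x + y = 6$$

The first step is to write the Lagrangian function

$$Z = xy + \lambda(6 - x - y)$$

For a stationary value of $Z$, it is necessary that

$$
\left.
\begin{aligned}
Z_\lambda &= 6 - x - y = 0 \\
Z_x &= y - \lambda = 0 \\
Z_y &= x - \lambda = 0
\end{aligned}
\right\}
\quad \text{or} \quad
\left\{
\begin{aligned}
x + y &= 6 \\
-\lambda + y &= 0 \\
-\lambda + x &= 0
\end{aligned}
\right.
$$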


Thus, by Cramer's rule or some other method, we can find

$$\lambda^* = 3 \qquad x^* = 3 \qquad y^* = 3$$

The stationary value is $Z^* = z^* = 9$, which needs to be tested against a second-order condition
before we can tell whether it is a maximum or minimum (or neither). That will be
taken up in Sec. 12.3.


† It is also possible to subsume the constant $c$ under the constraint function so that (12.6) appears
instead as $G(x, y) = 0$, where $G(x, y) = g(x, y) - c$. In that case, (12.7) should be changed to
$Z = f(x, y) + \lambda[0 - G(x, y)] = f(x, y) - \lambda G(x, y)$. The version in (12.6) is chosen because it facilitates the
study of the comparative-static effect of a change in the constraint constant later [see (12.16)].




Example 2 


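Find the extremum of

$$z = x_1^2 + x_2^2 \quad \text{subject to} \quad x_1 + 4x_2 = 2$$

The Lagrangian function is

$$Z = x_1^2 + x_2^2 + \lambda(2 - x_1 - 4x_2)$$

for which the necessary condition for a stationary value is

$$
\left.
\begin{aligned}
Z_\lambda &= 2 - x_1 - 4x_2 = 0 \\
Z_1 &= 2x_1 - \lambda = 0 \\
Z_2 &= 2x_2 - 4\lambda = 0
\end{aligned}
\right\}
\quad \text{or} \quad
\left\{
\begin{aligned}
x_1 + 4x_2 &= 2 \\
-\lambda + 2x_1 &= 0 \\
-4\lambda + 2x_2 &= 0
\end{aligned}
\right.
$$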


The stationary value of $Z$, defined by the solution

$$\lambda^* = \frac{4}{17} \qquad x_1^* = \frac{2}{17} \qquad x_2^* = \frac{8}{17}$$

is therefore $Z^* = z^* = \frac{4}{17}$. Again, a second-order condition should be consulted before we
can tell whether $z^*$ is a maximum or a minimum.


Total-Differential Approach 
In the discussion of the free extremum of $z = f(x, y)$, it was learned that the first-order
necessary condition may be stated in terms of the total differential $dz$ as follows:

$$dz = f_x\,dx + f_y\,dy = 0 \tag{12.9}$$


This statement remains valid after a constraint $g(x, y) = c$ is added. However, with the
constraint in the picture, we can no longer take both $dx$ and $dy$ as “arbitrary” changes as
before. For if $g(x, y) = c$, then $dg$ must be equal to $dc$, which is zero since $c$ is a constant.
Hence,

$$(dg =)\ g_x\,dx + g_y\,dy = 0 \tag{12.10}$$

and this relation makes $dx$ and $dy$ dependent on each other. The first-order necessary condition
therefore becomes $dz = 0$ [(12.9)], subject to $g = c$, and hence also subject to
$dg = 0$ [(12.10)]. By visual inspection of (12.9) and (12.10), it should be clear that, in
order to satisfy this necessary condition, we must have

$$\frac{f_x}{g_x} = \frac{f_y}{g_y} \tag{12.11}$$

This result can be verified by solving (12.10) for $dy$ and substituting the result into (12.9).
The condition (12.11), together with the constraint $g(x, y) = c$, will provide two equations
from which to find the critical values of $x$ and $y$.*

Does the total-differential approach yield the same first-order condition as the Lagrange-multiplier
method? Let us compare (12.8) with the result just obtained. The first equation

* Note that the constraint $g = c$ is still to be considered along with (12.11), even though we have
utilized the equation $dg = 0$, that is, (12.10), in deriving (12.11). While $g = c$ necessarily implies
$dg = 0$, the converse is not true: $dg = 0$ merely implies $g$ = a constant (not necessarily $c$). Unless the
constraint is explicitly considered, therefore, some information will be unwittingly left out of the
problem.




in (12.8) merely repeats the constraint; the new result requires its satisfaction also. The last
two equations in (12.8) can be rewritten, respectively, as

$$\frac{f_x}{g_x} = \lambda \quad \text{and} \quad \frac{f_y}{g_y} = \lambda \tag{12.11'}$$

and these convey precisely the same information as (12.11). Note, however, that whereas
the total-differential approach yields only the values of $x^*$ and $y^*$, the Lagrange-multiplier
method also gives the value of $\lambda^*$ as a direct by-product. As it turns out, $\lambda^*$ provides a measure
of the sensitivity of $Z^*$ (and $z^*$) to a shift of the constraint, as we shall presently
demonstrate. Therefore, the Lagrange-multiplier method offers the advantage of containing
certain built-in comparative-static information in the solution.
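
The agreement between the two approaches can be illustrated with the two examples solved earlier in this section. The sketch below simply evaluates the ratios $f_x/g_x$ and $f_y/g_y$ at the stationary points already found and compares them with $\lambda^*$; the variable names are illustrative assumptions.

```python
from fractions import Fraction

# Example 1: f = x*y, g = x + y, with x* = y* = 3 and lambda* = 3.
x, y, lam_1 = Fraction(3), Fraction(3), Fraction(3)
ratio_x_1 = y / 1  # f_x / g_x, where f_x = y and g_x = 1
ratio_y_1 = x / 1  # f_y / g_y, where f_y = x and g_y = 1

# Example 2: f = x1**2 + x2**2, g = x1 + 4*x2,
# with x1* = 2/17, x2* = 8/17 and lambda* = 4/17.
x1, x2, lam_2 = Fraction(2, 17), Fraction(8, 17), Fraction(4, 17)
ratio_1_2 = (2 * x1) / 1  # f_1 / g_1, where f_1 = 2*x1 and g_1 = 1
ratio_2_2 = (2 * x2) / 4  # f_2 / g_2, where f_2 = 2*x2 and g_2 = 4

print(ratio_x_1, ratio_y_1, lam_1)
print(ratio_1_2, ratio_2_2, lam_2)
```

In both cases the common value of the two ratios is the multiplier itself, which is the content of (12.11').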





An Interpretation of the Lagrange Multiplier 

To show that $\lambda^*$ indeed measures the sensitivity of $Z^*$ to changes in the constraint, let us
perform a comparative-static analysis on the first-order condition (12.8). Since $\lambda$, $x$, and $y$
are endogenous, the only available exogenous variable is the constraint parameter $c$. A
change in $c$ would cause a shift of the constraint curve in the $xy$ plane and thereby alter the
optimal solution. In particular, the effect of an increase in $c$ (a larger budget, or a larger production
quota) would indicate how the optimal solution is affected by a relaxation of the
constraint.

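To do the comparative-static analysis, we again resort to the implicit-function theorem.
Taking the three equations in (12.8) to be in the form of $F^j(\lambda, x, y; c) = 0$ (with
$j = 1, 2, 3$), and assuming them to have continuous partial derivatives, we must first check
that the following endogenous-variable Jacobian (where $f_{xy} = f_{yx}$ and $g_{xy} = g_{yx}$)

$$
|J| =
\begin{vmatrix}
\dfrac{\partial F^1}{\partial \lambda} & \dfrac{\partial F^1}{\partial x} & \dfrac{\partial F^1}{\partial y} \\
\dfrac{\partial F^2}{\partial \lambda} & \dfrac{\partial F^2}{\partial x} & \dfrac{\partial F^2}{\partial y} \\
\dfrac{\partial F^3}{\partial \lambda} & \dfrac{\partial F^3}{\partial x} & \dfrac{\partial F^3}{\partial y}
\end{vmatrix}
=
\begin{vmatrix}
0 & -g_x & -g_y \\
-g_x & f_{xx} - \lambda g_{xx} & f_{xy} - \lambda g_{xy} \\
-g_y & f_{xy} - \lambda g_{xy} & f_{yy} - \lambda g_{yy}
\end{vmatrix}
\tag{12.12}
$$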





does not vanish in the optimal state. At this moment, there is certainly no inkling that this
would be the case. But our previous experience with the comparative statics of optimization
problems [see the discussion of (11.46)] would suggest that this Jacobian is closely related
to the second-order sufficient condition, and that if the sufficient condition is satisfied,
then the Jacobian will be nonzero at the equilibrium (optimum). Leaving the full demonstration
of this fact to Sec. 12.3, let us proceed on the assumption that $|J| \neq 0$. If so, then
we can express $\lambda^*$, $x^*$, and $y^*$ all as implicit functions of the parameter $c$:


$$\lambda^* = \lambda^*(c) \qquad x^* = x^*(c) \qquad \text{and} \qquad y^* = y^*(c) \tag{12.13}$$





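all of which will have continuous derivatives. Also, we have the equilibrium identities

$$
\begin{aligned}
c - g(x^*, y^*) &\equiv 0 \\
f_x(x^*, y^*) - \lambda^* g_x(x^*, y^*) &\equiv 0 \\
f_y(x^*, y^*) - \lambda^* g_y(x^*, y^*) &\equiv 0
\end{aligned}
\tag{12.14}
$$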




Now since the optimal value of $Z$ depends on $\lambda^*$, $x^*$, and $y^*$, that is,

$$Z^* = f(x^*, y^*) + \lambda^*[c - g(x^*, y^*)] \tag{12.15}$$


we may, in view of (12.13), consider $Z^*$ to be a function of $c$ alone. Differentiating $Z^*$
totally with respect to $c$, we find

$$
\begin{aligned}
\frac{dZ^*}{dc} &= f_x \frac{dx^*}{dc} + f_y \frac{dy^*}{dc} + [c - g(x^*, y^*)]\frac{d\lambda^*}{dc}
+ \lambda^*\left(1 - g_x \frac{dx^*}{dc} - g_y \frac{dy^*}{dc}\right) \\
&= (f_x - \lambda^* g_x)\frac{dx^*}{dc} + (f_y - \lambda^* g_y)\frac{dy^*}{dc}
+ [c - g(x^*, y^*)]\frac{d\lambda^*}{dc} + \lambda^*
\end{aligned}
$$

where $f_x$, $f_y$, $g_x$, and $g_y$ are all to be evaluated at the optimum. By (12.14), however, the
first three terms on the right will all drop out. Thus we are left with the simple result

$$\frac{dZ^*}{dc} = \lambda^* \tag{12.16}$$

which validates our claim that the solution value of the Lagrange multiplier constitutes a
measure of the effect of a change in the constraint via the parameter $c$ on the optimal value
of the objective function.

A word of caution, however, is perhaps in order here. For this interpretation of $\lambda^*$, you
must express $Z$ specifically as in (12.7). In particular, write the last term as $\lambda[c - g(x, y)]$,
not $\lambda[g(x, y) - c]$.
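
The result in (12.16) can be illustrated with the problem solved at the start of this section, treating the constraint constant 60 as a parameter $c$. The sketch below is only an illustration under that assumption: the function name `solve_for_budget` and the step size are hypothetical, and the closed-form solution inside it comes from solving (12.4) with 60 replaced by $c$.

```python
from fractions import Fraction


def solve_for_budget(c):
    """Stationary point of Z = x1*x2 + 2*x1 + lam*(c - 4*x1 - 2*x2) for a given c.

    From the first-order conditions: x1 = 2*lam and x2 = 4*lam - 2, and the
    constraint 4*x1 + 2*x2 = c then gives 16*lam - 4 = c.
    """
    lam = (Fraction(c) + 4) / 16
    x1 = 2 * lam
    x2 = 4 * lam - 2
    return lam, x1, x2, x1 * x2 + 2 * x1


c = Fraction(60)
h = Fraction(1, 100)  # illustrative size of the shift in the constraint
lam_star, x1_star, x2_star, z_star = solve_for_budget(c)

# Central-difference estimate of dZ*/dc around c = 60.
slope = (solve_for_budget(c + h)[3] - solve_for_budget(c - h)[3]) / (2 * h)
print(lam_star, z_star, slope)
```

Since the optimal value is quadratic in $c$ for this particular problem, the central difference reproduces the derivative exactly, and it coincides with the multiplier.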


n-Variable and Multiconstraint Cases 


The generalization of the Lagrange-multiplier method to $n$ variables can be easily carried
out if we write the choice variables in subscript notation. The objective function will then
be in the form

$$z = f(x_1, x_2, \ldots, x_n)$$

subject to the constraint

$$g(x_1, x_2, \ldots, x_n) = c$$

It follows that the Lagrangian function will be

$$Z = f(x_1, x_2, \ldots, x_n) + \lambda[c - g(x_1, x_2, \ldots, x_n)]$$

for which the first-order condition will consist of the following $(n + 1)$ simultaneous
equations:

$$
\begin{aligned}
Z_\lambda &= c - g(x_1, x_2, \ldots, x_n) = 0 \\
Z_1 &= f_1 - \lambda g_1 = 0 \\
Z_2 &= f_2 - \lambda g_2 = 0 \\
&\;\;\vdots \\
Z_n &= f_n - \lambda g_n = 0
\end{aligned}
$$


Again, the first of these equations will assure us that the constraint is met, even though we 
are to focus our attention on the free Lagrangian function. 




When there is more than one constraint, the Lagrange-multiplier method is equally applicable,
provided we introduce as many such multipliers as there are constraints in the Lagrangian
function. Let an $n$-variable function be subject simultaneously to the two constraints

$$g(x_1, x_2, \ldots, x_n) = c \quad \text{and} \quad h(x_1, x_2, \ldots, x_n) = d$$

Then, adopting $\lambda$ and $\mu$ (the Greek letter mu) as the two undetermined multipliers, we may
construct a Lagrangian function as follows:

$$Z = f(x_1, x_2, \ldots, x_n) + \lambda[c - g(x_1, x_2, \ldots, x_n)] + \mu[d - h(x_1, x_2, \ldots, x_n)]$$

This function will have the same value as the original objective function $f$ if both constraints
are satisfied, i.e., if the last two terms in the Lagrangian function both vanish.
Considering $\lambda$ and $\mu$ as choice variables, we now count $(n + 2)$ variables; thus the first-order
condition will in this case consist of the following $(n + 2)$ simultaneous equations:

$$
\begin{aligned}
Z_\lambda &= c - g(x_1, x_2, \ldots, x_n) = 0 \\
Z_\mu &= d - h(x_1, x_2, \ldots, x_n) = 0 \\
Z_i &= f_i - \lambda g_i - \mu h_i = 0 \qquad (i = 1, 2, \ldots, n)
\end{aligned}
$$

These should normally enable us to solve for all the $x_i$ as well as $\lambda$ and $\mu$. As before, the
first two equations of the necessary condition represent essentially a mere restatement of
the two constraints.





EXERCISE 12.2


1. Use the Lagrange-multiplier method to find the stationary values of $z$:

(a) $z = xy$, subject to $x + 2y = 2$.

(b) $z = x(y + 4)$, subject to $x + y = 8$.

(c) $z = x - 3y - xy$, subject to $x + y = 6$.

(d) $z = 7 - y + x^2$, subject to $x + y = 0$.

2. In Prob. 1, find whether a slight relaxation of the constraint will increase or decrease
the optimal value of $z$. At what rate?

3. Write the Lagrangian function and the first-order condition for stationary values (without
solving the equations) for each of the following:

(a) $z = x + 2y + 3w + xy - yw$, subject to $x + y + 2w = 10$.

(b) $z = x^2 + 2xy + yw$, subject to $2x + y + w^2 = 24$ and $x + w = 8$.

4. If, instead of $g(x, y) = c$, the constraint is written in the form of $G(x, y) = 0$, how
should the Lagrangian function and the first-order condition be modified as a
consequence?

5. In discussing the total-differential approach, it was pointed out that, given the constraint
$g(x, y) = c$, we may deduce that $dg = 0$. By the same token, we can further
deduce that $d^2g = d(dg) = d(0) = 0$. Yet, in our earlier discussion of the unconstrained
extremum of a function $z = f(x, y)$, we had a situation where $dz = 0$ is accompanied
by either a positive definite or a negative definite $d^2z$, rather than $d^2z = 0$. How would
you account for this disparity of treatment in the two cases?

6. If the Lagrangian function is written as $Z = f(x, y) + \lambda[g(x, y) - c]$ rather than as in
(12.7), can we still interpret the Lagrange multiplier as in (12.16)? Give the new interpretation,
if any.




12.3 Second-Order Conditions 


The introduction of a Lagrange multiplier as an additional variable makes it possible to
apply to the constrained-extremum problem the same first-order condition used in the
free-extremum problem. It is tempting to go a step further and borrow the second-order
necessary-and-sufficient conditions as well. This, however, should not be done. For even
though $Z^*$ is indeed a standard type of extremum with respect to the choice variables, it is
not so with respect to the Lagrange multiplier. Specifically, we can see from (12.15) that,
unlike $x^*$ and $y^*$, if $\lambda^*$ is replaced by any other value of $\lambda$, no effect will be produced on $Z^*$,
since $[c - g(x^*, y^*)]$ is identically zero. Thus the role played by $\lambda$ in the optimal solution
differs basically from that of $x$ and $y$.† While it is harmless to treat $\lambda$ as just another choice
variable in the discussion of the first-order condition, we must be careful not to apply
blindly the second-order conditions developed for the free-extremum problem to the present
constrained case. Rather, we must derive a set of new ones. As we shall see, the new
conditions can again be stated in terms of the second-order total differential $d^2z$. However,
the presence of the constraint will entail certain significant modifications of the criterion.





Second-Order Total Differential 
It has been mentioned that, inasmuch as the constraint $g(x, y) = c$ means $dg = g_x\,dx +
g_y\,dy = 0$, as in (12.10), $dx$ and $dy$ no longer are both arbitrary. We may, of course, still take
(say) $dx$ as an arbitrary change, but then $dy$ must be regarded as dependent on $dx$, always to
be chosen so as to satisfy (12.10), i.e., to satisfy $dy = -(g_x/g_y)\,dx$. Viewed differently,
once the value of $dx$ is specified, $dy$ will depend on $g_x$ and $g_y$, but since the latter derivatives
in turn depend on the variables $x$ and $y$, $dy$ will also depend on $x$ and $y$. Obviously,
then, the earlier formula for $d^2z$ in (11.6), which is based on the arbitrariness of both $dx$ and
$dy$, can no longer apply.

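To find an appropriate new expression for $d^2z$, we must treat $dy$ as a variable dependent
on $x$ and $y$ during differentiation (if $dx$ is to be considered a constant). Thus,

$$
\begin{aligned}
d^2z = d(dz) &= \frac{\partial(dz)}{\partial x}\,dx + \frac{\partial(dz)}{\partial y}\,dy \\
&= \frac{\partial}{\partial x}(f_x\,dx + f_y\,dy)\,dx + \frac{\partial}{\partial y}(f_x\,dx + f_y\,dy)\,dy \\
&= \left[f_{xx}\,dx + \left(f_{xy}\,dy + f_y \frac{\partial(dy)}{\partial x}\right)\right]dx
 + \left[f_{yx}\,dx + \left(f_{yy}\,dy + f_y \frac{\partial(dy)}{\partial y}\right)\right]dy \\
&= f_{xx}\,dx^2 + f_{xy}\,dy\,dx + f_y \frac{\partial(dy)}{\partial x}\,dx
 + f_{yx}\,dx\,dy + f_{yy}\,dy^2 + f_y \frac{\partial(dy)}{\partial y}\,dy
\end{aligned}
$$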








† In a more general framework of constrained optimization known as “nonlinear programming,”
to be discussed in Chap. 13, it will be shown that, with inequality constraints, if $Z^*$ is a maximum
(minimum) with respect to $x$ and $y$, then it will in fact be a minimum (maximum) with respect to $\lambda$.
In other words, the point $(\lambda^*, x^*, y^*)$ is a saddle point. The present case, where $Z^*$ is a genuine
extremum with respect to $x$ and $y$ but is invariant with respect to $\lambda$, may be considered as a
degenerate case of the saddle point. The saddle-point nature of the solution $(\lambda^*, x^*, y^*)$ also leads to
the important concept of “duality.” But this subject is best to be pursued later.




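Since the third and the sixth terms can be reduced to

$$f_y\left[\frac{\partial(dy)}{\partial x}\,dx + \frac{\partial(dy)}{\partial y}\,dy\right] = f_y\,d(dy) = f_y\,d^2y$$

the desired expression for $d^2z$ is

$$d^2z = f_{xx}\,dx^2 + 2f_{xy}\,dx\,dy + f_{yy}\,dy^2 + f_y\,d^2y \tag{12.17}$$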


which differs from (11.6) only by the last term, $f_y\,d^2y$.

It should be noted that this last term is in the first degree [$d^2y$ is not the same as $(dy)^2$];
thus its presence in (12.17) disqualifies $d^2z$ as a quadratic form. However, $d^2z$ can be transformed
into a quadratic form by virtue of the constraint $g(x, y) = c$. Since the constraint
implies $dg = 0$ and also $d^2g = d(dg) = 0$, then by the procedure used in obtaining (12.17)
we can get

$$(d^2g =)\ g_{xx}\,dx^2 + 2g_{xy}\,dx\,dy + g_{yy}\,dy^2 + g_y\,d^2y = 0$$

Solving this last equation for $d^2y$ and substituting the result in (12.17), we are able to eliminate
the first-degree expression $d^2y$ and write $d^2z$ as the following quadratic form:

$$
d^2z = \left(f_{xx} - \frac{f_y}{g_y}\,g_{xx}\right)dx^2
+ 2\left(f_{xy} - \frac{f_y}{g_y}\,g_{xy}\right)dx\,dy
+ \left(f_{yy} - \frac{f_y}{g_y}\,g_{yy}\right)dy^2
$$
Because of (12.11'), the first parenthetical coefficient is reducible to $(f_{xx} - \lambda g_{xx})$, and
similarly for the other terms. However, by partially differentiating the derivatives in (12.8),
you will find that the following second derivatives

$$
\begin{aligned}
Z_{xx} &= f_{xx} - \lambda g_{xx} \\
Z_{xy} &= f_{xy} - \lambda g_{xy} = Z_{yx} \\
Z_{yy} &= f_{yy} - \lambda g_{yy}
\end{aligned}
\tag{12.18}
$$

are precisely equal to these parenthetical coefficients. Hence, by making use of the Lagrangian
function, we can finally express $d^2z$ more neatly as follows:

$$d^2z = Z_{xx}\,dx^2 + Z_{xy}\,dx\,dy + Z_{yx}\,dy\,dx + Z_{yy}\,dy^2 \tag{12.17'}$$


The coefficients of (12.17’) are simply the second partial derivatives of Z with respect 
to the choice variables x and y; together, therefore, they can give rise to a Hessian 
determinant. 


Second-Order Conditions 

For a constrained extremum of $z = f(x, y)$, subject to $g(x, y) = c$, the second-order
necessary-and-sufficient conditions still revolve around the algebraic sign of the second-order
total differential $d^2z$, evaluated at a stationary point. However, there is one important
change. In the present context, we are concerned with the sign definiteness or semidefiniteness
of $d^2z$, not for all possible values of $dx$ and $dy$ (not both zero), but only for those




$dx$ and $dy$ values (not both zero) satisfying the linear constraint (12.10), $g_x\,dx + g_y\,dy = 0$.
Thus the second-order necessary conditions are

For maximum of $z$:  $d^2z$ negative semidefinite, subject to $dg = 0$
For minimum of $z$:  $d^2z$ positive semidefinite, subject to $dg = 0$

and the second-order sufficient conditions are

For maximum of $z$:  $d^2z$ negative definite, subject to $dg = 0$
For minimum of $z$:  $d^2z$ positive definite, subject to $dg = 0$

In the following, we shall concentrate on the second-order sufficient conditions.

Inasmuch as the $(dx, dy)$ pairs satisfying the constraint $g_x\,dx + g_y\,dy = 0$ constitute
merely a subset of the set of all possible $dx$ and $dy$, the constrained sign definiteness is less
stringent (that is, easier to satisfy) than the unconstrained sign definiteness discussed in
Chap. 11. In other words, the second-order sufficient condition for a constrained-extremum
problem is a weaker condition than that for a free-extremum problem. This is welcome
news because, unlike necessary conditions which must be stringent in order to serve as
effective screening devices, sufficient conditions should be weak to be truly serviceable.†


The Bordered Hessian 
As in the case of free extremum, it is possible to express the second-order sufficient condition
in determinantal form. In place of the Hessian determinant $|H|$, however, in the
constrained-extremum case we shall encounter what is known as a bordered Hessian.

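In preparation for the development of this idea, let us first analyze the conditions for the
sign definiteness of a two-variable quadratic form, subject to a linear constraint, say,

$$q = au^2 + 2huv + bv^2 \quad \text{subject to} \quad \alpha u + \beta v = 0$$

Since the constraint implies $v = -(\alpha/\beta)u$, we can rewrite $q$ as a function of one variable only:

$$q = au^2 - 2h\frac{\alpha}{\beta}u^2 + b\frac{\alpha^2}{\beta^2}u^2 = (a\beta^2 - 2h\alpha\beta + b\alpha^2)\frac{u^2}{\beta^2}$$

It is obvious that $q$ is positive (negative) definite if and only if the expression in parentheses
is positive (negative). Now, it so happens that the following symmetric determinant

$$
\begin{vmatrix}
0 & \alpha & \beta \\
\alpha & a & h \\
\beta & h & b
\end{vmatrix}
= 2h\alpha\beta - a\beta^2 - b\alpha^2
$$

is exactly the negative of the said parenthetical expression. Consequently, we can state that

$$
q \text{ is }
\begin{Bmatrix} \text{positive definite} \\ \text{negative definite} \end{Bmatrix}
\text{ subject to } \alpha u + \beta v = 0
\quad \text{iff} \quad
\begin{vmatrix}
0 & \alpha & \beta \\
\alpha & a & h \\
\beta & h & b
\end{vmatrix}
\begin{Bmatrix} < 0 \\ > 0 \end{Bmatrix}
$$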


† “A million-dollar bank deposit” is clearly a sufficient condition for “being able to afford a steak
dinner.” But the extremely limited applicability of that condition renders it practically useless. A more
meaningful sufficient condition might be something like “fifty dollars in one's wallet,” which is a
much less stringent financial requirement.






It is noteworthy that the determinant used in this criterion is nothing but the discriminant
of the original quadratic form $\begin{vmatrix} a & h \\ h & b \end{vmatrix}$, with a border placed on top and a similar border on
the left. Furthermore, the border is merely composed of the two coefficients $\alpha$ and $\beta$ from the
constraint, plus a zero in the principal diagonal. This bordered discriminant is symmetric.





Example 1

Determine whether $q = 4u^2 + 4uv + 3v^2$, subject to $u - 2v = 0$, is either positive or
negative definite. We first form the bordered discriminant

$$
\begin{vmatrix}
0 & 1 & -2 \\
1 & 4 & 2 \\
-2 & 2 & 3
\end{vmatrix}
$$

which is made symmetric by splitting the coefficient of $uv$ into two equal parts for insertion
into the determinant. Inasmuch as the determinant has a negative value ($-27$), $q$ must be positive
definite.
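
The bordered-discriminant test in this example can be cross-checked against a direct evaluation of $q$ at points that satisfy the constraint. The sketch below is only an illustration; the helper names `det3` and `q` and the sample points are assumptions made for the check.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def q(u, v):
    """The quadratic form q = 4u^2 + 4uv + 3v^2."""
    return 4 * u**2 + 4 * u * v + 3 * v**2


# Border (alpha, beta) = (1, -2) from the constraint u - 2v = 0; the coefficient
# of uv is split into two equal parts, h = 2.
bordered = [[0, 1, -2], [1, 4, 2], [-2, 2, 3]]
value = det3(bordered)

# Points on the constraint have u = 2v; q should be positive at all of them.
samples = [q(2 * v, v) for v in (-3, -1, 1, 2)]
print(value, samples)
```

On the constraint, $q$ reduces to $27v^2$, which is why the sampled values are all positive whenever $v \neq 0$.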


When applied to the quadratic form $d^2z$ in (12.17'), the variables $u$ and $v$ become $dx$
and $dy$, respectively, and the (plain) discriminant consists of the Hessian
$\begin{vmatrix} Z_{xx} & Z_{xy} \\ Z_{yx} & Z_{yy} \end{vmatrix}$.
Moreover, the constraint to the quadratic form being $g_x\,dx + g_y\,dy = 0$, we have $\alpha = g_x$ and
$\beta = g_y$. Thus, for values of $dx$ and $dy$ that satisfy the said constraint, we now have the following
determinantal criterion for the sign definiteness of $d^2z$:

$$
d^2z \text{ is }
\begin{Bmatrix} \text{positive definite} \\ \text{negative definite} \end{Bmatrix}
\text{ subject to } dg = 0
\quad \text{iff} \quad
\begin{vmatrix}
0 & g_x & g_y \\
g_x & Z_{xx} & Z_{xy} \\
g_y & Z_{yx} & Z_{yy}
\end{vmatrix}
\begin{Bmatrix} < 0 \\ > 0 \end{Bmatrix}
$$

The determinant to the right, often referred to as a bordered Hessian, shall be denoted by
$|\bar{H}|$, where the bar on top symbolizes the border. On the basis of this, we may conclude that,
given a stationary value of $z = f(x, y)$ or of $Z = f(x, y) + \lambda[c - g(x, y)]$, a positive $|\bar{H}|$
is sufficient to establish it as a relative maximum of $z$; similarly, a negative $|\bar{H}|$ is sufficient
to establish it as a minimum, all the derivatives involved in $|\bar{H}|$ being evaluated at the critical
values of $x$ and $y$.

Now that we have derived the second-order sufficient condition, it is an easy matter
to verify that, as earlier claimed, the satisfaction of this condition will guarantee that the
endogenous-variable Jacobian (12.12) does not vanish in the optimal state. Substituting
(12.18) into (12.12), and multiplying both the first column and the first row of the Jacobian
by $-1$ (which will leave the value of the determinant unaltered), we see that

$$
|J| =
\begin{vmatrix}
0 & g_x & g_y \\
g_x & Z_{xx} & Z_{xy} \\
g_y & Z_{yx} & Z_{yy}
\end{vmatrix}
= |\bar{H}|
\tag{12.19}
$$

That is, the endogenous-variable Jacobian is identical with the bordered Hessian, a result
similar to (11.42) where it was shown that, in the free-extremum context, the endogenous-variable
Jacobian is identical with the plain Hessian. If, in fulfillment of the sufficient
condition, we have $|\bar{H}| \neq 0$ at the optimum, then $|J|$ must also be nonzero. Consequently,
in applying the implicit-function theorem to the present context, it would not be amiss to
substitute the condition $|\bar{H}| \neq 0$ for the usual condition $|J| \neq 0$. This practice will be followed
when we analyze the comparative statics of constrained-optimization problems in
Sec. 12.5.




Example 2

Let us now return to Example 1 of Sec. 12.2 and ascertain whether the stationary value
found there gives a maximum or a minimum. Since $Z_x = y - \lambda$ and $Z_y = x - \lambda$, the second-order
partial derivatives are $Z_{xx} = 0$, $Z_{xy} = Z_{yx} = 1$, and $Z_{yy} = 0$. The border elements we
need are $g_x = 1$ and $g_y = 1$. Thus we find that

$$
|\bar{H}| =
\begin{vmatrix}
0 & 1 & 1 \\
1 & 0 & 1 \\
1 & 1 & 0
\end{vmatrix}
= 2 > 0
$$

which establishes the value $z^* = 9$ as a maximum.


Example 3

Continuing on to Example 2 of Sec. 12.2, we see that $Z_1 = 2x_1 - \lambda$ and $Z_2 = 2x_2 - 4\lambda$.
These yield $Z_{11} = 2$, $Z_{12} = Z_{21} = 0$, and $Z_{22} = 2$. From the constraint $x_1 + 4x_2 = 2$, we obtain
$g_1 = 1$ and $g_2 = 4$. It follows that the bordered Hessian is

$$
|\bar{H}| =
\begin{vmatrix}
0 & 1 & 4 \\
1 & 2 & 0 \\
4 & 0 & 2
\end{vmatrix}
= -34 < 0
$$

and the value $z^* = \frac{4}{17}$ is a minimum.
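
The two bordered Hessians just evaluated can be recomputed in a few lines. The following is a minimal sketch for the two-variable, one-constraint case only; the function names `det3` and `classify` are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def classify(bordered_hessian):
    """Second-order sufficient test for two choice variables and one constraint."""
    value = det3(bordered_hessian)
    if value > 0:
        return "maximum"
    if value < 0:
        return "minimum"
    return "inconclusive"  # the sufficient condition gives no verdict


# z = xy subject to x + y = 6: border (1, 1), Z_xx = Z_yy = 0, Z_xy = 1.
first = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
# z = x1^2 + x2^2 subject to x1 + 4*x2 = 2: border (1, 4), Z_11 = Z_22 = 2, Z_12 = 0.
second = [[0, 1, 4], [1, 2, 0], [4, 0, 2]]

print(det3(first), classify(first))
print(det3(second), classify(second))
```

A zero determinant is reported as inconclusive because the condition being tested is sufficient, not necessary.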


Example 4

Consider a simple two-period model where a consumer's utility is a function of consumption
in both periods. Let the consumer's utility function be

$$U(x_1, x_2) = x_1 x_2$$

where $x_1$ is consumption in period 1 and $x_2$ is consumption in period 2. The consumer is
also endowed with a budget $B$ at the beginning of period 1.

Let $r$ denote a market interest rate at which the consumer can choose to borrow or lend
across the two periods. The consumer's intertemporal budget constraint is that $x_1$ and the
present value of $x_2$ add up to $B$. Thus,

$$x_1 + \frac{x_2}{1 + r} = B$$

The Lagrangian for this utility maximization problem is

$$Z = x_1 x_2 + \lambda\left(B - x_1 - \frac{x_2}{1 + r}\right)$$

with first-order conditions

$$
\begin{aligned}
\frac{\partial Z}{\partial \lambda} &= B - x_1 - \frac{x_2}{1 + r} = 0 \\
\frac{\partial Z}{\partial x_1} &= x_2 - \lambda = 0 \\
\frac{\partial Z}{\partial x_2} &= x_1 - \frac{\lambda}{1 + r} = 0
\end{aligned}
$$

Combining the last two first-order equations to eliminate $\lambda$ gives us

$$\frac{x_2}{x_1} = 1 + r \quad \text{or} \quad x_2 = x_1(1 + r)$$

Substituting this equation into the budget constraint then yields the solution

$$x_1^* = \frac{B}{2} \quad \text{and} \quad x_2^* = \frac{B(1 + r)}{2}$$

Next, we should check the second-order sufficient condition for a maximum. The bordered
Hessian for this problem is

$$
|\bar{H}| =
\begin{vmatrix}
0 & -1 & -\dfrac{1}{1 + r} \\
-1 & 0 & 1 \\
-\dfrac{1}{1 + r} & 1 & 0
\end{vmatrix}
= \frac{2}{1 + r} > 0
$$

Thus the second-order sufficient condition is satisfied for a maximum $U$.
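
To make the two-period solution concrete, the sketch below plugs in illustrative numbers. The budget of 100 and the interest rate of 10 percent are assumptions chosen only for the check, and `det3` is a hypothetical helper. The border is entered once with the constraint derivatives $g_1 = 1$ and $g_2 = 1/(1 + r)$ and once with their negatives, since multiplying the first row and the first column by $-1$ leaves the determinant unchanged.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


B = Fraction(100)    # hypothetical budget
r = Fraction(1, 10)  # hypothetical interest rate (10 percent)

x1 = B / 2
x2 = B * (1 + r) / 2
lam = x2  # from the first-order condition x2 - lambda = 0

# Residuals of the three first-order conditions; all should be zero.
foc = [B - x1 - x2 / (1 + r), x2 - lam, x1 - lam / (1 + r)]

g1, g2 = Fraction(1), 1 / (1 + r)
hbar_plus = det3([[0, g1, g2], [g1, 0, 1], [g2, 1, 0]])
hbar_minus = det3([[0, -g1, -g2], [-g1, 0, 1], [-g2, 1, 0]])

print(x1, x2, lam)
print(hbar_plus, hbar_minus)
```

With these numbers the consumer spends 50 in period 1 and 55 in period 2, and the bordered Hessian equals $2/(1 + r) = 20/11$, which is positive.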


n-Variable Case 
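When the objective function takes the form

$$z = f(x_1, x_2, \ldots, x_n) \quad \text{subject to} \quad g(x_1, x_2, \ldots, x_n) = c$$

the second-order condition still hinges on the sign of $d^2z$. Since the latter is a constrained
quadratic form in the variables $dx_1, dx_2, \ldots, dx_n$, subject to the relation

$$(dg =)\ g_1\,dx_1 + g_2\,dx_2 + \cdots + g_n\,dx_n = 0$$

the conditions for the positive or negative definiteness of $d^2z$ again involve a bordered
Hessian. But this time these conditions must be expressed in terms of the bordered leading
principal minors of the Hessian.

Given a bordered Hessian

$$
|\bar{H}| =
\begin{vmatrix}
0 & g_1 & g_2 & \cdots & g_n \\
g_1 & Z_{11} & Z_{12} & \cdots & Z_{1n} \\
g_2 & Z_{21} & Z_{22} & \cdots & Z_{2n} \\
\vdots & \vdots & \vdots & & \vdots \\
g_n & Z_{n1} & Z_{n2} & \cdots & Z_{nn}
\end{vmatrix}
$$

its bordered leading principal minors can be defined as

$$
|\bar{H}_2| =
\begin{vmatrix}
0 & g_1 & g_2 \\
g_1 & Z_{11} & Z_{12} \\
g_2 & Z_{21} & Z_{22}
\end{vmatrix}
\qquad
|\bar{H}_3| =
\begin{vmatrix}
0 & g_1 & g_2 & g_3 \\
g_1 & Z_{11} & Z_{12} & Z_{13} \\
g_2 & Z_{21} & Z_{22} & Z_{23} \\
g_3 & Z_{31} & Z_{32} & Z_{33}
\end{vmatrix}
\qquad \text{(etc.)}
$$

with the last one being $|\bar{H}_n| = |\bar{H}|$. In the newly introduced symbols, the horizontal bar
above $H$ again means bordered, and the subscript indicates the order of the leading principal
minor being bordered. For instance, $|\bar{H}_2|$ involves the second leading principal minor of
the (plain) Hessian, bordered with 0, $g_1$, and $g_2$; and similarly for the others. The conditions
for positive and negative definiteness of $d^2z$ are then

$$
d^2z \text{ is }
\begin{Bmatrix} \text{positive definite} \\ \text{negative definite} \end{Bmatrix}
\text{ subject to } dg = 0
\quad \text{iff} \quad
\begin{Bmatrix}
|\bar{H}_2|, |\bar{H}_3|, \ldots, |\bar{H}_n| < 0 \\
|\bar{H}_2| > 0;\ |\bar{H}_3| < 0;\ |\bar{H}_4| > 0;\ \text{etc.}
\end{Bmatrix}
$$

In the former, all the bordered leading principal minors, starting with $|\bar{H}_2|$, must be negative;
in the latter, they must alternate in sign. As previously, a positive definite $d^2z$ is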




TABLE 12.1 Determinantal Test for Relative Constrained Extremum: $z = f(x_1, \ldots, x_n)$,
Subject to $g(x_1, \ldots, x_n) = c$; with $Z = f(x_1, \ldots, x_n) + \lambda[c - g(x_1, \ldots, x_n)]$

Condition                             Maximum                                                Minimum
First-order necessary condition       $Z_\lambda = Z_1 = Z_2 = \cdots = Z_n = 0$             $Z_\lambda = Z_1 = Z_2 = \cdots = Z_n = 0$
Second-order sufficient condition†    $|\bar{H}_2| > 0$; $|\bar{H}_3| < 0$;                  $|\bar{H}_2|, |\bar{H}_3|, \ldots, |\bar{H}_n| < 0$
                                      $|\bar{H}_4| > 0$; ...; $(-1)^n|\bar{H}_n| > 0$

† Applicable only after the first-order necessary condition has been satisfied.


sufficient to establish a stationary value of $z$ as its minimum, whereas a negative definite
$d^2z$ is sufficient to establish it as a maximum.

Drawing the threads of the discussion together, we may summarize the conditions for a
constrained relative extremum in Table 12.1. You will recognize, however, that the criterion
stated in the table is not complete. Because the second-order sufficient condition is not necessary,
failure to satisfy the criteria stated does not preclude the possibility that the stationary
value is nonetheless a maximum or a minimum as the case may be. In many economic
applications, however, this (relatively less stringent) second-order sufficient condition is
either satisfied, or assumed to be satisfied, so that the information in the table is adequate.
It should prove instructive for you to compare the results contained in Table 12.1 with those
in Table 11.2 for the free-extremum case.
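
The sign pattern summarized in Table 12.1 lends itself to a mechanical check. The sketch below is one possible way to code it for a single constraint and $n \geq 2$ choice variables; the function names, and the three-variable test problem (minimizing $x^2 + y^2 + w^2$ subject to $x + y + w = 1$), are assumptions introduced here for illustration.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry != 0:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(minor)
    return total


def bordered_minors(hbar):
    """|H2|, |H3|, ..., |Hn| of an (n+1)x(n+1) bordered Hessian with one border."""
    n = len(hbar) - 1
    return [det([row[: k + 1] for row in hbar[: k + 1]]) for k in range(2, n + 1)]


def classify(hbar):
    """Apply the second-order sufficient condition of Table 12.1 (requires n >= 2)."""
    minors = bordered_minors(hbar)
    if all(value < 0 for value in minors):
        return "minimum"
    if all((-1) ** k * value > 0 for k, value in enumerate(minors, start=2)):
        return "maximum"
    return "inconclusive"


# Hypothetical problem: minimize x^2 + y^2 + w^2 subject to x + y + w = 1.
# Border (1, 1, 1); the Hessian of the Lagrangian is 2 on the diagonal, 0 elsewhere.
three_variable = [[0, 1, 1, 1], [1, 2, 0, 0], [1, 0, 2, 0], [1, 0, 0, 2]]

print(bordered_minors(three_variable), classify(three_variable))
```

The routine presupposes that the first-order condition already holds at the point where the derivatives are evaluated, in line with the footnote to the table.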


Multiconstraint Case 

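When more than one constraint appears in the problem, the second-order condition involves
a Hessian with more than one border. Suppose that there are $n$ choice variables
and $m$ constraints ($m < n$) of the form $g^j(x_1, \ldots, x_n) = c_j$. Then the Lagrangian function
will be

$$Z = f(x_1, \ldots, x_n) + \sum_{j=1}^{m} \lambda_j\,[c_j - g^j(x_1, \ldots, x_n)]$$

and the bordered Hessian will appear as

$$
|\bar{H}| =
\left|
\begin{array}{cccc|cccc}
0 & 0 & \cdots & 0 & g^1_1 & g^1_2 & \cdots & g^1_n \\
0 & 0 & \cdots & 0 & g^2_1 & g^2_2 & \cdots & g^2_n \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0 & g^m_1 & g^m_2 & \cdots & g^m_n \\
\hline
g^1_1 & g^2_1 & \cdots & g^m_1 & Z_{11} & Z_{12} & \cdots & Z_{1n} \\
g^1_2 & g^2_2 & \cdots & g^m_2 & Z_{21} & Z_{22} & \cdots & Z_{2n} \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
g^1_n & g^2_n & \cdots & g^m_n & Z_{n1} & Z_{n2} & \cdots & Z_{nn}
\end{array}
\right|
$$

where $g^j_i \equiv \partial g^j/\partial x_i$ are the partial derivatives of the constraint functions, and the
double-subscripted $Z$ symbols denote, as before, the second-order partial derivatives of the
Lagrangian function. Note that we have partitioned the bordered Hessian into four areas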






for visual clarity. The upper-left area consists of zeros only, and the lower-right area is simply
the plain Hessian. The other two areas, containing the $g^j_i$ derivatives, bear a mirror-image
relationship to each other with reference to the principal diagonal, thereby resulting
in a symmetric array of elements in the entire bordered Hessian.

Various bordered leading principal minors can be formed from $|\bar{H}|$. The one that contains
$Z_{22}$ as the last element of its principal diagonal may be denoted by $|\bar{H}_2|$, as before. By
including one more row and one more column, so that $Z_{33}$ enters into the scene, we will
have $|\bar{H}_3|$, and so forth. With this symbology, we can state the second-order sufficient condition
in terms of the signs of the following $(n - m)$ bordered leading principal minors:

$$|\bar{H}_{m+1}|,\ |\bar{H}_{m+2}|,\ \ldots,\ |\bar{H}_n|\ (= |\bar{H}|)$$

For a maximum of $z$, a sufficient condition is that these bordered leading principal minors
alternate in sign, the sign of $|\bar{H}_{m+1}|$ being that of $(-1)^{m+1}$. For a minimum of $z$, a sufficient
condition is that these principal minors all take the same sign, namely, that of $(-1)^m$.

Note that it makes an important difference whether we have an odd or even number of
constraints, because $(-1)$ raised to an odd power will yield the opposite sign to the case of
an even power. Note, also, that when $m = 1$, the condition just stated reduces to that presented
in Table 12.1.
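
The multiconstraint sign rule can be tried out on a small case. The sketch below assumes a hypothetical problem with $n = 3$ choice variables and $m = 2$ constraints (minimizing $x^2 + y^2 + w^2$ subject to $x + y + w = 1$ and $x - y = 0$); the function names are likewise illustrative. With $n - m = 1$, only the full determinant $|\bar{H}_3| = |\bar{H}|$ needs to be signed.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry != 0:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(minor)
    return total


def multiconstraint_minors(hbar, n, m):
    """|H_{m+1}|, ..., |H_n|: leading principal minors of order m + k for k = m+1, ..., n."""
    return [det([row[: m + k] for row in hbar[: m + k]]) for k in range(m + 1, n + 1)]


def classify(hbar, n, m):
    """Second-order sufficient condition with m constraints and n choice variables."""
    minors = multiconstraint_minors(hbar, n, m)
    if all((-1) ** m * value > 0 for value in minors):
        return "minimum"  # all minors share the sign of (-1)^m
    if all((-1) ** k * value > 0 for k, value in enumerate(minors, start=m + 1)):
        return "maximum"  # signs alternate, starting with that of (-1)^(m+1)
    return "inconclusive"


# Borders: gradients (1, 1, 1) and (1, -1, 0); the Hessian of the Lagrangian is 2*I.
hbar = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, -1, 0],
    [1, 1, 2, 0, 0],
    [1, -1, 0, 2, 0],
    [1, 0, 0, 0, 2],
]

print(multiconstraint_minors(hbar, n=3, m=2), classify(hbar, n=3, m=2))
```

Setting `m=1` in `classify` reproduces the single-constraint pattern of Table 12.1, which gives a quick consistency check on the sign conventions.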





EXERCISE 12.3 


1. Use the bordered Hessian to determine whether the stationary value of z obtained in 
each part of Exercise 12.2-3 is a maximum or a minimum. 

2. In stating the second-order sufficient conditions for constrained maximum and minimum,
we specified the algebraic signs of $|\bar{H}_2|$, $|\bar{H}_3|$, $|\bar{H}_4|$, etc., but not of $|\bar{H}_1|$. Write out
an appropriate expression for $|\bar{H}_1|$, and verify that it invariably takes the negative sign.

3. Recalling Property II of determinants (Sec. 5.3), show that:

   (a) By appropriately interchanging two rows and/or two columns of $|\bar{H}_2|$ and duly
       altering the sign of the determinant after each interchange, it can be transformed
       into

$$\begin{vmatrix} Z_{11} & Z_{12} & g_1 \\ Z_{21} & Z_{22} & g_2 \\ g_1 & g_2 & 0 \end{vmatrix}$$

   (b) By a similar procedure, $|\bar{H}_3|$ can be transformed into

$$\begin{vmatrix} Z_{11} & Z_{12} & Z_{13} & g_1 \\ Z_{21} & Z_{22} & Z_{23} & g_2 \\ Z_{31} & Z_{32} & Z_{33} & g_3 \\ g_1 & g_2 & g_3 & 0 \end{vmatrix}$$

   What alternative way of "bordering" the principal minors of the Hessian do these
   results suggest?

4. Write out the bordered Hessian for a constrained optimization problem with four
   choice variables and two constraints. Then state specifically the second-order sufficient
   condition for a maximum and for a minimum of z, respectively.




12.4 Quasiconcavity and Quasiconvexity 





FIGURE 12.3 [panels (a), (b), and (c)]


In Sec. 11.5 it was shown that, for a problem of free extremum, a knowledge of the
concavity or convexity of the objective function obviates the need to check the second-order
condition. In the context of constrained optimization, it is again possible to dispense with
the second-order condition if the surface or hypersurface has the appropriate type of
configuration. But this time the desired configuration is quasiconcavity (rather than concavity)
for a maximum, and quasiconvexity (rather than convexity) for a minimum. As we shall
demonstrate, quasiconcavity (quasiconvexity) is a weaker condition than concavity
(convexity). This is only to be expected, since the second-order sufficient condition to be
dispensed with is also weaker for the constrained optimization problem ($d^2z$ definite in sign
only for those $dx_i$ satisfying $dg = 0$) than for the free one ($d^2z$ definite in sign for all $dx_i$).


Geometric Characterization 
Quasiconcavity and quasiconvexity, like concavity and convexity, can be either strict or
nonstrict. We shall first present the geometric characterization of these concepts:


Let u and v be any two distinct points in the domain (a convex set) of a function f, and let
line segment uv in the domain give rise to arc MN on the graph of the function, such that
point N is higher than or equal in height to point M. Then function f is said to be quasiconcave
(quasiconvex) if all points on arc MN other than M and N are higher than or equal in
height to point M (lower than or equal in height to point N). The function f is said to be
strictly quasiconcave (strictly quasiconvex) if all the points on arc MN other than M and N
are strictly higher than point M (strictly lower than point N).


It should be clear from this that any strictly quasiconcave (strictly quasiconvex) function is
quasiconcave (quasiconvex), but the converse is not true.

For a better grasp, let us examine the illustrations in Fig. 12.3, all drawn for the
one-variable case. In Fig. 12.3a, line segment uv in the domain gives rise to arc MN on the curve
such that N is higher than M. Since all the points between M and N on the said arc are
strictly higher than M, this particular arc satisfies the condition for strict quasiconcavity.
For the curve to qualify as strictly quasiconcave, however, all possible (u, v) pairs must have
arcs that satisfy the same condition. This is indeed the case for the function in Fig. 12.3a.
Note that this function also satisfies the condition for (nonstrict) quasiconcavity. But it fails
the condition for quasiconvexity, because some points on arc MN are higher than N, which
is forbidden for a quasiconvex function. The function in Fig. 12.3b has the opposite
configuration. All the points on arc M'N' are lower than N', the higher of the two ends, and
the same is true of all arcs that can be drawn. Thus the function in Fig. 12.3b is strictly
quasiconvex. As you can verify, it also satisfies the condition for (nonstrict) quasiconvexity,
but fails the condition for quasiconcavity. What distinguishes Fig. 12.3c is the presence
of a horizontal line segment M''N'', where all the points have the same height. As a result,
that line segment, and hence the entire curve, can only meet the condition for
quasiconcavity, but not strict quasiconcavity.

FIGURE 12.4 [panels (a) and (b)]

Generally speaking, a quasiconcave function that is not also concave has a graph roughly
shaped like a bell, or a portion thereof, and a quasiconvex function has a graph shaped like an
inverted bell, or a portion thereof. On the bell, it is admissible (though not required) to have
both concave and convex segments. This more permissive nature of the characterization
makes quasiconcavity (quasiconvexity) a weaker condition than concavity (convexity). In
Fig. 12.4, we contrast strict concavity against strict quasiconcavity for the two-variable case.
As drawn, both surfaces depict increasing functions, as they contain only the ascending
portions of a dome and a bell, respectively. The surface in Fig. 12.4a is strictly concave, but the
one in Fig. 12.4b is certainly not, since it contains convex portions near the base of the bell.
Yet it is strictly quasiconcave; all the arcs on the surface, exemplified by MN and M'N',
satisfy the condition that all the points on each arc between the two end points are higher than the
lower end point. Returning to Fig. 12.4a, we should note that the surface therein is also strictly
quasiconcave. Although we have not drawn any illustrative arcs MN and M'N' in Fig. 12.4a,
it is not difficult to check that all possible arcs do indeed satisfy the condition for strict
quasiconcavity. In general, a strictly concave function must be strictly quasiconcave, although the
converse is not true. We shall demonstrate this more formally in the paragraphs that follow.


Algebraic Definition 

The preceding geometric characterization can be translated into an algebraic definition for
easier generalization to higher-dimensional cases:

A function $f$ is $\begin{Bmatrix} \text{quasiconcave} \\ \text{quasiconvex} \end{Bmatrix}$ iff, for any pair of distinct points $u$ and $v$ in the (convex-set)
domain of $f$, and for $0 < \theta < 1$,

$$f(v) \ge f(u) \;\Rightarrow\; f[\theta u + (1 - \theta)v] \begin{Bmatrix} \ge f(u) \\ \le f(v) \end{Bmatrix} \tag{12.20}$$




To adapt this definition to strict quasiconcavity and quasiconvexity, the two weak
inequalities on the right should be changed into strict inequalities $\begin{Bmatrix} > f(u) \\ < f(v) \end{Bmatrix}$. You may find it
instructive to compare (12.20) with (11.20).
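
Definition (12.20) lends itself to a brute-force spot check on sampled points. The sketch below is a hedged illustration, not a proof: the helper name `check_12_20`, the grid, the weights, and the three sample functions are all assumptions introduced here. A failure on the sample disproves the property, while a pass is only suggestive.

```python
from fractions import Fraction

def check_12_20(f, grid, thetas, strict=False):
    """Test (12.20) on every ordered pair of distinct grid points.

    Returns (quasiconcave, quasiconvex) as judged on the sample only.
    """
    concave_ok = convex_ok = True
    for u in grid:
        for v in grid:
            if u == v or f(v) < f(u):
                continue                      # (12.20) only binds when f(v) >= f(u)
            for t in thetas:
                mid = f(t * u + (1 - t) * v)  # height above the weighted average
                if strict:
                    concave_ok &= mid > f(u)
                    convex_ok &= mid < f(v)
                else:
                    concave_ok &= mid >= f(u)
                    convex_ok &= mid <= f(v)
    return concave_ok, convex_ok

grid = [Fraction(i, 2) for i in range(0, 9)]       # x = 0, 0.5, ..., 4
thetas = [Fraction(i, 10) for i in range(1, 10)]   # 0 < theta < 1

square = lambda x: x * x            # z = x^2 restricted to x >= 0
hump = lambda x: -(x - 2) ** 2      # single peak at x = 2
flat = lambda x: min(x, 2)          # increasing, then a horizontal segment
```

On this sample, `square` passes both strict tests, `hump` passes only the quasiconcavity test, and `flat` passes both nonstrict tests but neither strict one because of its horizontal segment.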

From this definition, the following three theorems readily follow. These will be stated
in terms of a function $f(x)$, where $x$ can be interpreted as a vector of variables,
$x = (x_1, \ldots, x_n)$.


Theorem I (negative of a function) If $f(x)$ is quasiconcave (strictly quasiconcave),
then $-f(x)$ is quasiconvex (strictly quasiconvex).


Theorem II (concavity versus quasiconcavity) Any concave (convex) function is
quasiconcave (quasiconvex), but the converse is not true. Similarly, any strictly concave
(strictly convex) function is strictly quasiconcave (strictly quasiconvex), but the converse is
not true.


Theorem III (linear function) If $f(x)$ is a linear function, then it is quasiconcave as
well as quasiconvex.


Theorem I follows from the fact that multiplying an inequality by $-1$ reverses the sense
of inequality. Let $f(x)$ be quasiconcave, with $f(v) \ge f(u)$. Then, by (12.20),
$f[\theta u + (1 - \theta)v] \ge f(u)$. As far as the function $-f(x)$ is concerned, however, we have (after
multiplying the two inequalities through by $-1$) $-f(u) \ge -f(v)$ and $-f[\theta u + (1 - \theta)v] \le -f(u)$.
Interpreting $-f(u)$ as the height of point N, and $-f(v)$ as the height of M, we see
that the function $-f(x)$ satisfies the condition for quasiconvexity in (12.20). This proves
one of the four cases cited in Theorem I; the proofs for the other three are similar.

For Theorem II, we shall only prove that concavity implies quasiconcavity. Let $f(x)$ be
concave. Then, by (11.20),

$$f[\theta u + (1 - \theta)v] \ge \theta f(u) + (1 - \theta) f(v)$$

Now assume that $f(v) \ge f(u)$; then any weighted average of $f(v)$ and $f(u)$ cannot
possibly be less than $f(u)$, i.e.,

$$\theta f(u) + (1 - \theta) f(v) \ge f(u)$$

Combining these two results, we find that, by transitivity,

$$f[\theta u + (1 - \theta)v] \ge f(u) \qquad \text{for } f(v) \ge f(u)$$

which satisfies the definition of quasiconcavity in (12.20). Note, however, that the
condition for quasiconcavity cannot guarantee concavity.

Once Theorem II is established, Theorem III follows immediately. We already know that
a linear function is both concave and convex, though not strictly so. In view of Theorem II,
a linear function must also be both quasiconcave and quasiconvex, though not strictly so.

In the case of concave and convex functions, there is a useful theorem to the effect
that the sum of concave (convex) functions is also concave (convex). Unfortunately,
this theorem cannot be generalized to quasiconcave and quasiconvex functions. For
instance, a sum of two quasiconcave functions is not necessarily quasiconcave (see
Exercise 12.4-3).
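
One possible counterexample, chosen here purely for illustration (it is not the pair of functions used in Exercise 12.4-3), takes $f(x) = x^3$ and $g(x) = -x$ on the real line. Each is monotonic and hence quasiconcave, yet their sum violates (12.20) at the particular points assumed below:

```latex
\begin{aligned}
h(x) &= f(x) + g(x) = x^3 - x \\
u &= -\tfrac{1}{2}, \quad v = 2, \quad \theta = \tfrac{2}{5}
  \quad\Longrightarrow\quad \theta u + (1 - \theta)v = -\tfrac{1}{5} + \tfrac{6}{5} = 1 \\
h(u) &= \tfrac{3}{8}, \qquad h(v) = 6 \ge h(u), \qquad
h[\theta u + (1 - \theta)v] = h(1) = 0 < h(u)
\end{aligned}
```

Since the intermediate point lies below $h(u)$ even though $h(v) \ge h(u)$, the sum $h$ is not quasiconcave.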


FIGURE 12.5 [panels (a), (b), and (c)]


Sometimes it may prove easier to check quasiconcavity and quasiconvexity by the
following alternative definition:

A function $f(x)$, where $x$ is a vector of variables, is $\begin{Bmatrix} \text{quasiconcave} \\ \text{quasiconvex} \end{Bmatrix}$ iff, for any constant $k$,
the set

$$\begin{Bmatrix} S^{\ge} \equiv \{x \mid f(x) \ge k\} \\ S^{\le} \equiv \{x \mid f(x) \le k\} \end{Bmatrix} \text{ is a convex set} \tag{12.21}$$

The sets $S^{\ge}$ and $S^{\le}$, which are subsets of the domain, were introduced earlier (Fig. 11.10) to
show that a convex function (or even a concave function) can give rise to a convex set. Here
we are employing these two sets as tests for quasiconcavity and quasiconvexity. The three
functions in Fig. 12.5 all contain concave as well as convex segments and hence are neither
convex nor concave. But the function in Fig. 12.5a is quasiconcave, because for any value of
$k$ (only one of which has been illustrated), the set $S^{\ge}$ is convex. The function in Fig. 12.5b is,
on the other hand, quasiconvex since the set $S^{\le}$ is convex. The function in Fig. 12.5c (a
monotonic function) differs from the other two in that both $S^{\ge}$ and $S^{\le}$ are convex sets.
Hence that function is quasiconcave as well as quasiconvex.

Note that while (12.21) can be used to check quasiconcavity and quasiconvexity, it is
incapable of distinguishing between strict and nonstrict varieties of these properties.
Note, also, that the defining properties in (12.21) are in themselves not sufficient for
concavity and convexity, respectively. In particular, given a concave function which must
perforce be quasiconcave, we can conclude that $S^{\ge}$ is a convex set; but given that $S^{\ge}$ is a
convex set, we can conclude only that the function $f$ is quasiconcave (but not necessarily
concave).


Example 1
Check $z = x^2$ ($x \ge 0$) for quasiconcavity and quasiconvexity. This function is easily verified
geometrically to be convex, in fact strictly so. Hence it is quasiconvex. Interestingly, it is also
quasiconcave. For its graph (the right half of a U-shaped curve, initiating from the point of
origin and increasing at an increasing rate) is, similarly to Fig. 12.5c, capable of generating
a convex $S^{\ge}$ as well as a convex $S^{\le}$.

If we wish to apply (12.20) instead, we first let $u$ and $v$ be any two distinct nonnegative
values of $x$. Then

$$f(u) = u^2 \qquad f(v) = v^2 \qquad \text{and} \qquad f[\theta u + (1 - \theta)v] = [\theta u + (1 - \theta)v]^2$$

Suppose that $f(v) \ge f(u)$, that is, $v^2 \ge u^2$; then $v \ge u$, or more specifically, $v > u$ (since $u$
and $v$ are distinct). Inasmuch as the weighted average $[\theta u + (1 - \theta)v]$ must lie between
$u$ and $v$, we may write the continuous inequality

$$v^2 > [\theta u + (1 - \theta)v]^2 > u^2 \qquad \text{for } 0 < \theta < 1$$

or

$$f(v) > f[\theta u + (1 - \theta)v] > f(u) \qquad \text{for } 0 < \theta < 1$$

By (12.20), this result makes the function $f$ both quasiconcave and quasiconvex, indeed
strictly so.


Example 2
Show that $z = f(x, y) = xy$ (with $x, y \ge 0$) is quasiconcave. We shall use the criterion in
(12.21) and establish that the set $S^{\ge} = \{(x, y) \mid xy \ge k\}$ is a convex set for any $k$. For this
purpose, we set $xy = k$ to obtain an isovalue curve for each value of $k$. Like $x$ and $y$, $k$ should
be nonnegative. In case $k > 0$, the isovalue curve is a rectangular hyperbola in the first
quadrant of the xy plane. The set $S^{\ge}$, consisting of all the points on or above a rectangular
hyperbola, is a convex set. In the other case, with $k = 0$, the isovalue curve as defined by
$xy = 0$ is L-shaped, with the L coinciding with the nonnegative segments of the x and y
axes. The set $S^{\ge}$, consisting this time of the entire nonnegative quadrant, is again a convex
set. Thus, by (12.21), the function $z = xy$ (with $x, y \ge 0$) is quasiconcave.

You should be careful not to confuse the shape of the isovalue curves $xy = k$ (which is
defined in the xy plane) with the shape of the surface $z = xy$ (which is defined in the xyz
space). The characteristic of the z surface (quasiconcave in 3-space) is what we wish to
ascertain; the shape of the isovalue curves (convex in 2-space for positive $k$) is of interest here
only as a means to delineate the sets $S^{\ge}$ in order to apply the criterion in (12.21).


Example 3
Show that $z = f(x, y) = (x - a)^2 + (y - b)^2$ is quasiconvex. Let us again apply (12.21).
Setting $(x - a)^2 + (y - b)^2 = k$, we see that $k$ must be nonnegative. For each $k$, the
isovalue curve is a circle in the xy plane with its center at $(a, b)$ and with radius $\sqrt{k}$. Since
$S^{\le} = \{(x, y) \mid (x - a)^2 + (y - b)^2 \le k\}$ is the set of all points on or inside a circle, it
constitutes a convex set. This is true even when $k = 0$ (when the circle degenerates into a single
point, $(a, b)$) since by convention a single point is considered as a convex set. Thus the
given function is quasiconvex.
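
The level-set criterion (12.21) used in Examples 2 and 3 can be probed numerically by checking whether midpoints of sampled members of a level set stay in that set. The following is a rough sketch under stated assumptions: the helper name `level_sets_look_convex`, the grid, the levels, and the parameter choice a = 2, b = 1 for Example 3 are all hypothetical. A failed midpoint disproves convexity; a pass on a finite sample is merely suggestive.

```python
from fractions import Fraction
from itertools import product

def level_sets_look_convex(f, points, levels, upper=True):
    """Sampled version of criterion (12.21).

    For each level k, collect the sampled points in the upper set (f >= k)
    or the lower set (f <= k) and check that every pairwise midpoint
    belongs to the same set.
    """
    for k in levels:
        inside = (lambda p: f(*p) >= k) if upper else (lambda p: f(*p) <= k)
        members = [p for p in points if inside(p)]
        for p, q in product(members, repeat=2):
            mid = tuple((a + b) / 2 for a, b in zip(p, q))
            if not inside(mid):
                return False
    return True

axis = [Fraction(i, 2) for i in range(0, 9)]        # 0, 0.5, ..., 4
points = list(product(axis, repeat=2))              # sample of the nonnegative quadrant
levels = [Fraction(i) for i in range(0, 17, 2)]     # k = 0, 2, ..., 16

xy = lambda x, y: x * y                             # Example 2
circle = lambda x, y: (x - 2) ** 2 + (y - 1) ** 2   # Example 3 with assumed a = 2, b = 1
```

On this sample the upper sets of `xy` and the lower sets of `circle` pass the test, while the lower sets of `xy` fail it (the points (4, 0) and (0, 4) both give xy = 0, but their midpoint gives xy = 4), which is in line with the claim in Exercise 12.4-7 that xy is not quasiconvex.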


Differentiable Functions 
The definitions (12.20) and (12.21) do not require differentiability of the function $f$. If $f$ is
differentiable, however, quasiconcavity and quasiconvexity can alternatively be defined in
terms of its first derivatives:

A differentiable function of one variable, $f(x)$, is $\begin{Bmatrix} \text{quasiconcave} \\ \text{quasiconvex} \end{Bmatrix}$ iff, for any pair of
distinct points $u$ and $v$ in the domain,

$$f(v) \ge f(u) \;\Rightarrow\; \begin{Bmatrix} f'(u)(v - u) \\ f'(v)(v - u) \end{Bmatrix} \ge 0 \tag{12.22}$$

Quasiconcavity and quasiconvexity will be strict, if the weak inequality on the right is
changed to the strict inequality $> 0$. When there are two or more independent variables, the
definition is to be modified as follows:

A differentiable function $f(x_1, \ldots, x_n)$ is $\begin{Bmatrix} \text{quasiconcave} \\ \text{quasiconvex} \end{Bmatrix}$ iff, for any two distinct points
$u = (u_1, \ldots, u_n)$ and $v = (v_1, \ldots, v_n)$ in the domain,

$$f(v) \ge f(u) \;\Rightarrow\; \begin{Bmatrix} \sum_{j=1}^{n} f_j(u)(v_j - u_j) \\ \sum_{j=1}^{n} f_j(v)(v_j - u_j) \end{Bmatrix} \ge 0 \tag{12.22'}$$

where $f_j \equiv \partial f/\partial x_j$, to be evaluated at $u$ or $v$ as the case may be.

Again, for strict quasiconcavity and quasiconvexity, the weak inequality on the right should
be changed to the strict inequality $> 0$.

Finally, if a function $z = f(x_1, \ldots, x_n)$ is twice continuously differentiable,
quasiconcavity and quasiconvexity can be checked by means of the first and second partial
derivatives of the function, arranged into the bordered determinant

$$|B| = \begin{vmatrix} 0 & f_1 & f_2 & \cdots & f_n \\ f_1 & f_{11} & f_{12} & \cdots & f_{1n} \\ f_2 & f_{21} & f_{22} & \cdots & f_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ f_n & f_{n1} & f_{n2} & \cdots & f_{nn} \end{vmatrix} \tag{12.23}$$

This bordered determinant resembles the bordered Hessian $|\bar{H}|$ introduced in Sec. 12.3.
But unlike the latter, the border in $|B|$ is composed of the first derivatives of the function $f$
rather than an extraneous constraint function $g$. It is because $|B|$ depends exclusively on
the derivatives of function $f$ itself that we can use $|B|$, along with its leading principal
minors

$$|B_1| = \begin{vmatrix} 0 & f_1 \\ f_1 & f_{11} \end{vmatrix} \qquad |B_2| = \begin{vmatrix} 0 & f_1 & f_2 \\ f_1 & f_{11} & f_{12} \\ f_2 & f_{21} & f_{22} \end{vmatrix} \qquad \cdots \qquad |B_n| = |B| \tag{12.24}$$

to characterize the configuration of that function.

We shall state here two conditions; one is necessary, and the other is sufficient. Both relate
to quasiconcavity on a domain consisting only of the nonnegative orthant (the n-dimensional
analog of the nonnegative quadrant), that is, with $x_1, \ldots, x_n \ge 0$.*


For $z = f(x_1, \ldots, x_n)$ to be quasiconcave on the nonnegative orthant, it is necessary that

$$|B_1| \le 0, \quad |B_2| \ge 0, \quad \ldots, \quad |B_n| \begin{Bmatrix} \le \\ \ge \end{Bmatrix} 0 \quad \text{if } n \text{ is } \begin{Bmatrix} \text{odd} \\ \text{even} \end{Bmatrix} \tag{12.25}$$

wherever the partial derivatives are evaluated in the nonnegative orthant.


* Whereas concavity (convexity) of a function on a convex domain can always be extended to 
concavity (convexity) over the entire space, quasiconcavity and quasiconvexity cannot. For 
instance, our conclusions in Examples 1 and 2 will not hold if the variables are allowed to take 
negative values. The two conditions given here are based on Kenneth J. Arrow and Alain C. 
Enthoven, “Quasi-Concave Programming,” Econometrica, October 1961, p. 797 (Theorem 5), 
and Akira Takayama, Analytical Methods in Economics, University of Michigan Press, 1993, p. 65 
(Theorem 1.12). 


A sufficient condition for $f$ to be strictly quasiconcave on the nonnegative orthant is that

$$|B_1| < 0, \quad |B_2| > 0, \quad \ldots, \quad |B_n| \begin{Bmatrix} < \\ > \end{Bmatrix} 0 \quad \text{if } n \text{ is } \begin{Bmatrix} \text{odd} \\ \text{even} \end{Bmatrix} \tag{12.26}$$

wherever the partial derivatives are evaluated in the nonnegative orthant.


Note that the condition $|B_1| \le 0$ in (12.25) is automatically satisfied because $|B_1| = -f_1^2$;
it is listed here only for the sake of symmetry. So is the condition $|B_1| < 0$ in (12.26).


Example 4
The function $z = f(x_1, x_2) = x_1 x_2$ ($x_1, x_2 \ge 0$) is quasiconcave (cf. Example 2). We shall now
check this by (12.22'). Let $u = (u_1, u_2)$ and $v = (v_1, v_2)$ be any two points in the domain.
Then $f(u) = u_1 u_2$ and $f(v) = v_1 v_2$. Assume that

$$f(v) \ge f(u) \qquad \text{or} \qquad v_1 v_2 \ge u_1 u_2 \qquad (v_1, v_2, u_1, u_2 \ge 0) \tag{12.27}$$

Since the partial derivatives of $f$ are $f_1 = x_2$ and $f_2 = x_1$, (12.22') amounts to the condition
that

$$f_1(u)(v_1 - u_1) + f_2(u)(v_2 - u_2) = u_2(v_1 - u_1) + u_1(v_2 - u_2) \ge 0$$

or, upon rearrangement,

$$u_2(v_1 - u_1) \ge u_1(u_2 - v_2) \tag{12.28}$$

We need to consider four possibilities regarding the values of $u_1$ and $u_2$. First, if
$u_1 = u_2 = 0$, then (12.28) is trivially satisfied. Second, if $u_1 = 0$ but $u_2 > 0$, then (12.28) reduces
to the condition $u_2 v_1 \ge 0$, which is again satisfied since $u_2$ and $v_1$ are both nonnegative.
Third, if $u_1 > 0$ and $u_2 = 0$, then (12.28) reduces to the condition $0 \ge -u_1 v_2$, which is still
satisfied. Fourth and last, suppose that $u_1$ and $u_2$ are both positive, so that $v_1$ and $v_2$ are also
positive. Subtracting $v_2 u_1$ from both sides of (12.27), we obtain

$$v_2(v_1 - u_1) \ge u_1(u_2 - v_2) \tag{12.29}$$

Three subpossibilities now present themselves:

1. If $u_2 = v_2$, then $v_1 \ge u_1$. In fact, we should have $v_1 > u_1$ since $(u_1, u_2)$ and $(v_1, v_2)$ are
distinct points. The fact that $u_2 = v_2$ and $v_1 > u_1$ implies that condition (12.28) is satisfied.

2. If $u_2 > v_2$, then we must also have $v_1 > u_1$ by (12.29). Multiplying both sides of (12.29)
by $u_2/v_2$, we get

$$\begin{aligned} u_2(v_1 - u_1) &\ge \frac{u_2}{v_2}\, u_1(u_2 - v_2) \\ &> u_1(u_2 - v_2) \qquad \left[\text{since } \frac{u_2}{v_2} > 1\right] \end{aligned} \tag{12.30}$$

Thus (12.28) is again satisfied.

3. The final subpossibility is that $u_2 < v_2$, implying that $u_2/v_2$ is a positive fraction. In
this case, the first line of (12.30) still holds. The second line also holds, but now for a
different reason: a fraction ($u_2/v_2$) of a negative number ($u_2 - v_2$) is greater than the latter
number itself.
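
The case analysis above can be spot-checked numerically. The sketch below is an illustration under stated assumptions: the function names and both integer grids are hypothetical. It simply enumerates pairs (u, v) with f(v) ≥ f(u) and looks for violations of the inequality in (12.22') for f = x1·x2.

```python
from itertools import product

def f1(x1, x2):
    """Partial derivative of x1*x2 with respect to x1."""
    return x2

def f2(x1, x2):
    """Partial derivative of x1*x2 with respect to x2."""
    return x1

def violations_of_12_22_prime(points):
    """Return the pairs (u, v) with f(v) >= f(u) for which
    f1(u)(v1 - u1) + f2(u)(v2 - u2) is negative."""
    bad = []
    for u, v in product(points, repeat=2):
        if u == v or v[0] * v[1] < u[0] * u[1]:
            continue                     # (12.22') only binds when f(v) >= f(u)
        lhs = f1(*u) * (v[0] - u[0]) + f2(*u) * (v[1] - u[1])
        if lhs < 0:
            bad.append((u, v))
    return bad

nonneg = list(product(range(0, 6), repeat=2))          # integer grid, nonnegative quadrant
with_negatives = list(product(range(-3, 4), repeat=2)) # grid that also admits negative values
```

On the nonnegative grid no violation turns up, in line with the argument in the text. On the second grid the pair u = (-1, -1), v = (1, 1) is one violation, which illustrates why the result is tied to the nonnegative domain.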


Inasmuch as (12.28) is satisfied in every possible situation that can arise, the function
$z = x_1 x_2$ ($x_1, x_2 \ge 0$) is quasiconcave. Therefore, the necessary condition (12.25) should
hold. Because the partial derivatives of $f$ are

$$f_1 = x_2 \qquad f_2 = x_1 \qquad f_{11} = f_{22} = 0 \qquad f_{12} = f_{21} = 1$$

the relevant leading principal minors turn out to be

$$|B_1| = \begin{vmatrix} 0 & x_2 \\ x_2 & 0 \end{vmatrix} = -x_2^2 \le 0 \qquad |B_2| = \begin{vmatrix} 0 & x_2 & x_1 \\ x_2 & 0 & 1 \\ x_1 & 1 & 0 \end{vmatrix} = 2x_1 x_2 \ge 0$$

Thus (12.25) is indeed satisfied. Note, however, that the sufficient condition (12.26) is
satisfied only over the positive orthant.


Example 5
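Show that $z = f(x, y) = x^a y^b$ ($x, y > 0$; $0 < a, b < 1$) is strictly quasiconcave. The partial
derivatives of this function are

$$f_x = a x^{a-1} y^b \qquad f_y = b x^a y^{b-1}$$

$$f_{xx} = a(a - 1) x^{a-2} y^b \qquad f_{xy} = f_{yx} = ab\, x^{a-1} y^{b-1} \qquad f_{yy} = b(b - 1) x^a y^{b-2}$$

Thus the leading principal minors of $|B|$ have the following signs:

$$|B_1| = \begin{vmatrix} 0 & f_x \\ f_x & f_{xx} \end{vmatrix} = -(a x^{a-1} y^b)^2 < 0$$

$$|B_2| = \begin{vmatrix} 0 & f_x & f_y \\ f_x & f_{xx} & f_{xy} \\ f_y & f_{yx} & f_{yy} \end{vmatrix} = [2a^2 b^2 - a(a - 1)b^2 - a^2 b(b - 1)]\, x^{3a-2} y^{3b-2} > 0$$

This satisfies the sufficient condition for strict quasiconcavity in (12.26).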
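
As a numerical cross-check of Example 5, the sketch below evaluates both bordered minors at a sample point and compares the second one with the closed-form expression. The function names and the sample values (x = 2, y = 3, a = 0.3, b = 0.5) are assumptions made here for illustration only.

```python
import math

def bordered_minors_cobb_douglas(x, y, a, b):
    """Return (|B1|, |B2|) for f = x**a * y**b from the partial derivatives of Example 5."""
    fx = a * x ** (a - 1) * y ** b
    fy = b * x ** a * y ** (b - 1)
    fxx = a * (a - 1) * x ** (a - 2) * y ** b
    fxy = a * b * x ** (a - 1) * y ** (b - 1)
    fyy = b * (b - 1) * x ** a * y ** (b - 2)
    b1 = -fx ** 2                                   # determinant of [[0, fx], [fx, fxx]]
    # 3x3 bordered determinant, expanded along its first row
    b2 = 2 * fx * fy * fxy - fx ** 2 * fyy - fy ** 2 * fxx
    return b1, b2

def closed_form_b2(x, y, a, b):
    """The expression for |B2| stated in Example 5."""
    coeff = 2 * a**2 * b**2 - a * (a - 1) * b**2 - a**2 * b * (b - 1)
    return coeff * x ** (3 * a - 2) * y ** (3 * b - 2)

b1, b2 = bordered_minors_cobb_douglas(2.0, 3.0, 0.3, 0.5)
```

With these sample values `b1` comes out negative and `b2` positive, and `b2` agrees with `closed_form_b2(2.0, 3.0, 0.3, 0.5)` up to floating-point rounding.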


A Further Look at the Bordered Hessian 
The bordered determinant $|B|$, as defined in (12.23), differs from the bordered Hessian

$$|\bar{H}| = \begin{vmatrix} 0 & g_1 & g_2 & \cdots & g_n \\ g_1 & Z_{11} & Z_{12} & \cdots & Z_{1n} \\ g_2 & Z_{21} & Z_{22} & \cdots & Z_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ g_n & Z_{n1} & Z_{n2} & \cdots & Z_{nn} \end{vmatrix}$$

in two ways: (1) the border elements in $|B|$ are the first-order partial derivatives of function
$f$ rather than $g$; and (2) the remaining elements in $|B|$ are the second-order partial
derivatives of $f$ rather than the Lagrangian function $Z$. However, in the special case of a linear
constraint equation, $g(x_1, \ldots, x_n) = a_1 x_1 + \cdots + a_n x_n = c$ (a case frequently
encountered in economics; see Sec. 12.5), $Z_{ij}$ reduces to $f_{ij}$. For then the Lagrangian function is

$$Z = f(x_1, \ldots, x_n) + \lambda(c - a_1 x_1 - \cdots - a_n x_n)$$

so that

$$Z_j = f_j - \lambda a_j \qquad \text{and} \qquad Z_{ij} = f_{ij}$$

Turning to the borders, we note that the linear constraint function yields the first derivative
$g_j = a_j$. Moreover, when the first-order condition is satisfied, we have $Z_j = f_j - \lambda a_j = 0$,
so that $f_j = \lambda a_j$, or $f_j = \lambda g_j$. Thus the border in $|B|$ is simply that of $|\bar{H}|$ multiplied by a
positive scalar $\lambda$. By factoring out $\lambda$ successively from the horizontal and vertical borders
of $|B|$ (see Sec. 5.3, Example 5), we have

$$|B| = \lambda^2 |\bar{H}|$$

Consequently, in the linear-constraint case, the two bordered determinants always possess
the same sign at the stationary point of $Z$. By the same token, the leading principal minors
$|B_i|$ and $|\bar{H}_i|$ ($i = 1, \ldots, n$) must also share the same sign at that point. It then follows that
if the bordered determinant $|B|$ satisfies the sufficient condition for strict quasiconcavity in
(12.26), the bordered Hessian $|\bar{H}|$ must then satisfy the second-order sufficient condition
for constrained maximization in Table 12.1.
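
For concreteness, the factoring step can be written out for the two-variable case. This is a worked illustration for $n = 2$ only; it assumes, as in the text, a linear constraint (so that $Z_{ij} = f_{ij}$) and evaluation at the stationary point (so that $f_j = \lambda g_j$):

```latex
|B| =
\begin{vmatrix}
0 & \lambda g_1 & \lambda g_2 \\
\lambda g_1 & f_{11} & f_{12} \\
\lambda g_2 & f_{21} & f_{22}
\end{vmatrix}
= \lambda
\begin{vmatrix}
0 & g_1 & g_2 \\
\lambda g_1 & f_{11} & f_{12} \\
\lambda g_2 & f_{21} & f_{22}
\end{vmatrix}
= \lambda^2
\begin{vmatrix}
0 & g_1 & g_2 \\
g_1 & Z_{11} & Z_{12} \\
g_2 & Z_{21} & Z_{22}
\end{vmatrix}
= \lambda^2 |\bar{H}|
```

The first equality uses $f_j = \lambda g_j$ in both borders, the second factors $\lambda$ out of the first row, and the third factors $\lambda$ out of the first column and replaces $f_{ij}$ by $Z_{ij}$.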


Absolute versus Relative Extrema 

A more comprehensive picture of the relationship between quasiconcavity and second-order
conditions is presented in Fig. 12.6. (A suitable modification will adapt the figure for
quasiconvexity.) Constructed in the same spirit as Fig. 11.5, and to be read in the same manner,
this figure relates quasiconcavity to absolute as well as relative constrained maxima
of a twice-differentiable function $z = f(x_1, \ldots, x_n)$. The three ovals in the upper part
summarize the first- and second-order conditions for a relative constrained maximum. And
the rectangles in the middle column, like those in Fig. 11.5, tie the concepts of relative
maximum, absolute maximum, and unique absolute maximum to one another.

But the really interesting information lies in the two diamonds and the elongated
$\Rightarrow$ symbols passing through them. The one on the left tells us that, once the first-order
condition is satisfied, and if the two provisos listed in the diamond are also satisfied, we have a
sufficient condition for an absolute constrained maximum. The first proviso is that the
function $f$ be explicitly quasiconcave, a new term which we must hasten to define.

A quasiconcave function $f$ is explicitly quasiconcave if it has the further property that

$$f(v) > f(u) \;\Rightarrow\; f[\theta u + (1 - \theta)v] > f(u)$$


This defining property means that whenever a point on the surface, $f(v)$, is higher than
another, $f(u)$, then all the intermediate points (the points on the surface lying directly above
line segment $uv$ in the domain) must also be higher than $f(u)$. What such a stipulation
does is to rule out any horizontal plane segments on the surface except for a plateau at the
top of the surface.* Note that the condition for explicit quasiconcavity is not as strong as the
condition for strict quasiconcavity, since the latter requires $f[\theta u + (1 - \theta)v] > f(u)$ even
for $f(v) = f(u)$, implying that nonhorizontal plane segments are ruled out, too.† The other
proviso in the left-side diamond is that the set $\{(x_1, \ldots, x_n) \mid g(x_1, \ldots, x_n) = c\}$ be
convex. When both provisos are met, we shall be dealing with that portion of a bell-shaped,
horizontal-segment-free surface (or hypersurface) lying directly above a convex set in the
domain. A local maximum found on such a subset of the surface must be an absolute
constrained maximum.


* Let the surface contain a horizontal plane segment $P$ such that $f(u) \in P$ and $f(v) \notin P$. Then those
intermediate points that are located on $P$ will be of equal height to $f(u)$, thereby violating the first
proviso.

† Let the surface contain a slanted plane segment $P'$ such that $f(u) = f(v)$ are both located on $P'$.
Then all the intermediate points will also be on $P'$ and be of equal height to $f(u)$, thereby violating
the cited requirement for strict quasiconcavity.


FIGURE 12.6 [flowchart linking the first-order condition ($z^*$ is a stationary value subject to the constraint), the second-order sufficient and necessary conditions ($d^2z$ negative definite, or negative semidefinite, at $z^*$ subject to $dg = 0$), and the two diamonds ($f$ explicitly quasiconcave, or strictly quasiconcave, and the constraint set convex) to the rectangles for a relative, an absolute, and a unique absolute constrained maximum]


The diamond on the right in Fig. 12.6 involves the stronger condition of strict
quasiconcavity. A strictly quasiconcave function must be explicitly quasiconcave, although the
converse is not true. Hence, when strict quasiconcavity replaces explicit quasiconcavity, an
absolute constrained maximum is still ensured. But this time that absolute constrained
maximum must also be unique, since the absence of any plane segment anywhere on the
surface decidedly precludes the possibility of multiple constrained maxima.
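
The three grades of quasiconcavity discussed in this subsection can be told apart on simple one-variable examples. The following sketch is illustrative only: the helper name `classify_quasiconcavity`, the grid, the weights, and the three sample functions are assumptions, and a grid check can refute a property but cannot prove it.

```python
from fractions import Fraction

def classify_quasiconcavity(f, grid, thetas):
    """Return (quasiconcave, explicitly_quasiconcave, strictly_quasiconcave)
    as judged on the sampled points only."""
    quasi = explicit = strict = True
    for u in grid:
        for v in grid:
            if u == v or f(v) < f(u):
                continue
            for t in thetas:
                mid = f(t * u + (1 - t) * v)
                quasi &= mid >= f(u)
                strict &= mid > f(u)
                if f(v) > f(u):            # the extra property binds only in this case
                    explicit &= mid > f(u)
    return quasi, quasi and explicit, strict

grid = [Fraction(i, 2) for i in range(0, 9)]      # 0, 0.5, ..., 4
thetas = [Fraction(i, 4) for i in range(1, 4)]    # 1/4, 1/2, 3/4

plateau_on_top = lambda x: min(x, 2)              # rises, then flat at its maximum

def ledge(x):
    """Rises, stays flat on [1, 2], then rises again."""
    return x if x <= 1 else (1 if x <= 2 else x - 1)

peak = lambda x: -(x - 2) ** 2                    # single strict peak at x = 2
```

On this sample, `plateau_on_top` is explicitly but not strictly quasiconcave (its only flat segment is at the top), `ledge` is quasiconcave but not explicitly so (its flat segment lies below the top), and `peak` passes all three tests.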





EXERCISE 12.4 
1. Draw a strictly quasiconcave curve $z = f(x)$ which is
   (a) also quasiconvex      (d) not concave
   (b) not quasiconvex       (e) neither concave nor convex
   (c) not convex            (f) both concave and convex

2. Are the following functions quasiconcave? Strictly so? First check graphically, and then
   algebraically by (12.20). Assume that $x \ge 0$.
   (a) $f(x) = a$     (b) $f(x) = a + bx$ ($b > 0$)     (c) $f(x) = a + cx^2$ ($c < 0$)

3. (a) Let $z = f(x)$ plot as a negatively sloped curve shaped like the right half of a bell in
       the first quadrant, passing through the points (0, 5), (2, 4), (3, 2), and (5, 1). Let
       $z = g(x)$ plot as a positively sloped 45° line. Are $f(x)$ and $g(x)$ quasiconcave?
   (b) Now plot the sum $f(x) + g(x)$. Is the sum function quasiconcave?

4. By examining their graphs, and using (12.21), check whether the following functions
   are quasiconcave, quasiconvex, both, or neither:
   (a) $f(x) = x^3 - 2x$     (b) $f(x_1, x_2) = 6x_1 - 9x_2$     (c) $f(x_1, x_2) = x_2 - \ln x_1$

5. (a) Verify that a cubic function $z = ax^3 + bx^2 + cx + d$ is in general neither
       quasiconcave nor quasiconvex.
   (b) Is it possible to impose restrictions on the parameters such that the function
       becomes both quasiconcave and quasiconvex for $x \ge 0$?

6. Use (12.22) to check $z = x^2$ ($x \ge 0$) for quasiconcavity and quasiconvexity.

7. Show that $z = xy$ ($x, y \ge 0$) is not quasiconvex.
8. Use bordered determinants to check the following functions for quasiconcavity and
   quasiconvexity:
   (a) $z = -x^2 - y^2$ ($x, y > 0$)     (b) $z = -(x + 1)^2 - (y + 2)^2$ ($x, y > 0$)


12.5 Utility Maximization and Consumer Demand 





The maximization of a utility function was cited in Sec. 12.1 as an example of constrained
optimization. Let us now reexamine this problem in more detail. For simplicity, we shall
still allow our hypothetical consumer the choice of only two goods, both of which have
continuous, positive marginal-utility functions. The prices of both goods are market-determined,
hence exogenous, although in this section we shall omit the zero subscript from the
price symbols. If the purchasing power of the consumer is a given amount $B$ (for budget),
the problem posed will be that of maximizing a smooth utility (index) function

$$U = U(x, y) \qquad (U_x, U_y > 0)$$

subject to

$$xP_x + yP_y = B$$




First-Order Condition 
The Lagrangian function of this optimization model is

$$Z = U(x, y) + \lambda(B - xP_x - yP_y)$$

As the first-order condition, we have the following set of simultaneous equations:

$$\begin{aligned} Z_\lambda &= B - xP_x - yP_y = 0 \\ Z_x &= U_x - \lambda P_x = 0 \\ Z_y &= U_y - \lambda P_y = 0 \end{aligned} \tag{12.31}$$

Since the last two equations are equivalent to

$$\frac{U_x}{P_x} = \frac{U_y}{P_y} = \lambda \tag{12.31'}$$

the first-order condition in effect calls for the satisfaction of (12.31'), subject to the budget
constraint, which is the first equation in (12.31). What (12.31') states is merely the familiar
proposition in classical consumer theory that, in order to maximize utility, consumers must allocate
their budgets so as to equalize the ratio of marginal utility to price for every commodity.
Specifically, in the equilibrium or optimum, these ratios should have the common value $\lambda^*$.
As we learned earlier, $\lambda^*$ measures the comparative-static effect of the constraint constant
on the optimal value of the objective function. Hence, we have in the present context
$\lambda^* = (\partial U^*/\partial B)$; that is, the optimal value of the Lagrange multiplier can be interpreted as
the marginal utility of money (budget money) when the consumer's utility is maximized.

If we restate the condition in (12.31') in the form

$$\frac{U_x}{U_y} = \frac{P_x}{P_y} \tag{12.31''}$$

the first-order condition can be given an alternative interpretation, in terms of indifference
curves.
An indifference curve is defined as the locus of the combinations of $x$ and $y$ that will
yield a constant level of $U$. This means that on an indifference curve we must find

$$dU = U_x\,dx + U_y\,dy = 0$$

with the implication that $dy/dx = -U_x/U_y$. Accordingly, if we plot an indifference
curve in the xy plane, as in Fig. 12.7, its slope, $dy/dx$, must be equal to the negative of
the marginal-utility ratio $U_x/U_y$. (Since we assume $U_x, U_y > 0$, the slope of the
indifference curve must be negative.) Note that $U_x/U_y$, the negative of the indifference-curve
slope, is called the marginal rate of substitution between the two goods.

What about the meaning of $P_x/P_y$? As we shall presently see, this ratio represents the
negative of the slope of the graph of the budget constraint. The budget constraint,
$xP_x + yP_y = B$, can be written alternatively as

$$y = \frac{B}{P_y} - \frac{P_x}{P_y}\,x$$

so that, when plotted in the xy plane as in Fig. 12.7, it emerges as a straight line with slope
$-P_x/P_y$ (and vertical intercept $B/P_y$).


FIGURE 12.7 [panels (a) and (b): indifference curves and the budget line (slope $= -P_x/P_y$) in the xy plane]

In this light, the new version of the first-order condition, (12.31'') plus the budget
constraint, discloses that, to maximize utility, a consumer must allocate the budget such
that the slope of the budget line (on which the consumer must remain) is equal to the slope
of some indifference curve. This condition is met at point E in Fig. 12.7a, where the budget
line is tangent to an indifference curve.
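
To make the first-order condition concrete, the sketch below works through one assumed utility function, U = xy, with hypothetical budget and prices (B = 60, Px = 4, Py = 2); none of these specifics come from the text. It checks the tangency condition (12.31'') and the interpretation of λ* as the marginal utility of budget money.

```python
from fractions import Fraction

def optimum_for_u_xy(B, Px, Py):
    """Solve the first-order conditions (12.31) for the assumed utility U = x*y.

    With U_x = y and U_y = x, (12.31'') gives y/x = Px/Py; substituting this
    into the budget constraint yields the closed-form values below.
    """
    x = Fraction(B) / (2 * Px)
    y = Fraction(B) / (2 * Py)
    lam = y / Px                      # lambda* = U_x / Px
    return x, y, lam

def max_utility(B, Px, Py):
    """Optimal utility U* as a function of the budget and the prices."""
    x, y, _ = optimum_for_u_xy(B, Px, Py)
    return x * y

B, Px, Py = 60, 4, 2                  # hypothetical budget and prices
x, y, lam = optimum_for_u_xy(B, Px, Py)

# Central-difference estimate of dU*/dB (exact here because U* is quadratic in B)
h = Fraction(1, 100)
dU_dB = (max_utility(B + h, Px, Py) - max_utility(B - h, Px, Py)) / (2 * h)
```

Under these assumptions the consumer buys x = 15/2 and y = 15, the marginal rate of substitution y/x equals the price ratio Px/Py = 2, and both ratios of marginal utility to price equal λ* = 15/4, which also matches the computed dU*/dB.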


Second-Order Condition 
If the bordered Hessian in the present problem is positive, i.e., if

$$|\bar{H}| = \begin{vmatrix} 0 & P_x & P_y \\ P_x & U_{xx} & U_{xy} \\ P_y & U_{yx} & U_{yy} \end{vmatrix} = 2P_x P_y U_{xy} - P_y^2 U_{xx} - P_x^2 U_{yy} > 0 \tag{12.32}$$

(with all the derivatives evaluated at the critical values $x^*$ and $y^*$), then the stationary value
of $U$ will assuredly be a maximum. The presence of the derivatives $U_{xx}$, $U_{yy}$, and $U_{xy}$ in
(12.32) clearly suggests that meeting this condition would entail certain restrictions on
the utility function and, hence, on the shape of the indifference curves. What are these
restrictions?

Considering first the shape of the indifference curves, we can show that a positive $|\bar{H}|$
means the strict convexity of the (downward-sloping) indifference curve at the point of
tangency E. Just as the downward slope of an indifference curve is guaranteed by a negative
$dy/dx$ $(= -U_x/U_y)$, its strict convexity would be ensured by a positive $d^2y/dx^2$. To get
the expression for $d^2y/dx^2$, we can differentiate $-U_x/U_y$ with respect to x; but in doing
so, we should bear in mind not only that both $U_x$ and $U_y$ (being derivatives) are functions
of x and y but also that, along a given indifference curve, y is itself a function of x.
Accordingly, both $U_x$ and $U_y$ can be considered as functions of x alone; therefore, we can get a
total derivative

$$
\frac{d^2y}{dx^2} = \frac{d}{dx}\left(-\frac{U_x}{U_y}\right)
= -\frac{1}{U_y^2}\left(U_y \frac{dU_x}{dx} - U_x \frac{dU_y}{dx}\right) \tag{12.33}
$$


Since x can affect $U_x$ and $U_y$ not only directly but also indirectly, via the intermediary of y,
we have

$$
\frac{dU_x}{dx} = U_{xx} + U_{yx}\frac{dy}{dx} \qquad
\frac{dU_y}{dx} = U_{xy} + U_{yy}\frac{dy}{dx} \tag{12.34}
$$




where $dy/dx$ refers to the slope of the indifference curve. Now, at the point of tangency
E (the only point relevant to the discussion of the second-order condition) this slope is
identical with that of the budget constraint; that is, $dy/dx = -P_x/P_y$. Thus we can rewrite
(12.34) as

$$
\frac{dU_x}{dx} = U_{xx} - U_{yx}\frac{P_x}{P_y} \qquad
\frac{dU_y}{dx} = U_{xy} - U_{yy}\frac{P_x}{P_y} \tag{12.34'}
$$

Substituting (12.34') into (12.33) and utilizing the information that

$$
U_x = \frac{U_y P_x}{P_y} \qquad [\text{from } (12.31'')]
$$

and then factoring out $U_y/P_y^2$, we can finally transform (12.33) into

$$
\frac{d^2y}{dx^2} = \frac{2P_x P_y U_{xy} - P_y^2 U_{xx} - P_x^2 U_{yy}}{U_y P_y^2}
= \frac{|\bar{H}|}{U_y P_y^2} \tag{12.33'}
$$

It is clear that when the second-order sufficient condition (12.32) is satisfied, the second
derivative in (12.33') is positive, and the relevant indifference curve is strictly convex at the
point of tangency. In the present context, it is also true that the strict convexity of the
indifference curve at the tangency implies the satisfaction of the sufficient condition (12.32).
This is because, given that the indifference curves are negatively sloped, with no stationary
points anywhere, the possibility of a zero $d^2y/dx^2$ value on a strictly convex curve is ruled
out. Thus strict convexity can now result only in a positive $d^2y/dx^2$, and hence a positive
$|\bar{H}|$, by (12.33').
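As a numerical check on (12.33'), the following minimal sketch assumes a hypothetical utility function U = xy with P_x = 2, P_y = 1, and B = 40; none of these values comes from the text. It compares the curvature given by the bordered-Hessian formula with the second derivative computed directly from the indifference curve y = c/x.

```python
from fractions import Fraction

# Hypothetical illustration (not from the text): U(x, y) = x*y,
# with P_x = 2, P_y = 1, B = 40.
Px, Py, B = Fraction(2), Fraction(1), Fraction(40)

# The first-order condition U_x/U_y = y/x = P_x/P_y plus the budget
# constraint gives x* = B/(2 P_x) and y* = B/(2 P_y).
x_star = B / (2 * Px)
y_star = B / (2 * Py)

# Partial derivatives of U = x*y at the optimum.
Ux, Uy = y_star, x_star
Uxx, Uyy, Uxy = Fraction(0), Fraction(0), Fraction(1)

# Bordered Hessian, as in (12.32).
H_bar = 2 * Px * Py * Uxy - Py**2 * Uxx - Px**2 * Uyy

# Curvature of the indifference curve from (12.33').
d2y_dx2_formula = H_bar / (Uy * Py**2)

# Direct check: on the curve x*y = c we have y = c/x, so y'' = 2c/x^3.
c = x_star * y_star
d2y_dx2_direct = 2 * c / x_star**3

print(H_bar, d2y_dx2_formula, d2y_dx2_direct)
```

Both routes give the same positive second derivative, so for this assumed utility function the indifference curve is strictly convex at the tangency and (12.32) holds.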

Recall, however, that the derivatives in $|\bar{H}|$ are to be evaluated at the critical values $x^*$
and $y^*$ only. Thus the strict convexity of the indifference curve, as a sufficient condition,
pertains only to the point of tangency, and it is not inconceivable for the curve to contain
a concave segment away from point E, as illustrated by the broken curve segment in
Fig. 12.7a. On the other hand, if the utility function is known to be a smooth, increasing, 
strictly quasiconcave function, then every indifference curve will be everywhere strictly 
convex. Such a utility function has a surface like the one in Fig. 12.44. When such a surface 
is cut with a plane parallel to the xy plane, we obtain for each of such cuts a cross section 
which, when projected onto the xy plane, becomes a strictly convex, downward-sloping 
indifference curve. In that event, no matter where the point of tangency may occur, the
second-order sufficient condition will always be satisfied. Besides, there can exist only
one point of tangency, one that yields the unique absolute maximum level of utility
attainable on the given linear budget. This result, of course, conforms perfectly to what the
diamond on the right of Fig. 12.6 states.

You have been repeatedly reminded that the second-order sufficient condition is not
necessary. Let us illustrate here the maximization of utility while (12.32) fails to hold. Suppose
that, as illustrated in Fig. 12.7b, the relevant indifference curve contains a linear segment
that coincides with a portion of the budget line. Then clearly we have multiple maxima,
since the first-order condition $U_x/U_y = P_x/P_y$ is now satisfied at every point on the linear
segment of the indifference curve, including $E_1$, $E_2$, and $E_3$. In fact, these are absolute
constrained maxima. But since on a line segment $d^2y/dx^2$ is zero, we have $|\bar{H}| = 0$ by
(12.33'). Thus maximization is achieved in this case even though the second-order
sufficient condition (12.32) is violated.

The fact that a linear segment appears on the indifference curve suggests the presence 
of a slanted plane segment on the utility surface. This occurs when the utility function is 







explicitly quasiconcave rather than strictly quasiconcave. As Fig. 12.7b shows, points $E_1$,
$E_2$, and $E_3$, all of which are located on the same (highest attainable) indifference curve,
yield the same absolute maximum utility under the given linear budget constraint.
Referring to Fig. 12.6 again, we note that this result is perfectly consistent with the message
conveyed by the diamond on the left.


Comparative-Static Analysis 
In our consumer model, the prices $P_x$ and $P_y$ are exogenous, as is the amount of the
budget B. If we assume the satisfaction of the second-order sufficient condition, we can analyze
the comparative-static properties of the model on the basis of the first-order condition
(12.31), viewed as a set of equations $F^j = 0$ $(j = 1, 2, 3)$, where each $F^j$ function has
continuous partial derivatives. As pointed out in (12.19), the endogenous-variable Jacobian
of this set of equations must have the same value as the bordered Hessian; that is,
$|J| = |\bar{H}|$. Thus, when the second-order condition (12.32) is met, $|J|$ must be positive and
it does not vanish at the initial optimum. Consequently, the implicit-function theorem is
applicable, and we may express the optimal values of the endogenous variables as implicit
functions of the exogenous variables:
$$
\begin{aligned}
\lambda^* &= \lambda^*(P_x, P_y, B) \\
x^* &= x^*(P_x, P_y, B) \\
y^* &= y^*(P_x, P_y, B)
\end{aligned} \tag{12.35}
$$

These are known to possess continuous derivatives that give comparative-static
information. In particular, the derivatives of the last two functions $x^*$ and $y^*$, which are descriptive
of the consumer's demand behavior, can tell us how the consumer will react to changes in
prices and in the budget. To find these derivatives, however, we must first convert (12.31)
into a set of equilibrium identities as follows:

$$
\begin{aligned}
B - x^* P_x - y^* P_y &\equiv 0 \\
U_x(x^*, y^*) - \lambda^* P_x &\equiv 0 \\
U_y(x^*, y^*) - \lambda^* P_y &\equiv 0
\end{aligned} \tag{12.36}
$$

By taking the total differential of each identity in turn (allowing every variable to change),
and noting that $U_{xy} = U_{yx}$, we then arrive at the linear system

$$
\begin{aligned}
-P_x\,dx^* - P_y\,dy^* &= x^*\,dP_x + y^*\,dP_y - dB \\
-P_x\,d\lambda^* + U_{xx}\,dx^* + U_{xy}\,dy^* &= \lambda^*\,dP_x \\
-P_y\,d\lambda^* + U_{yx}\,dx^* + U_{yy}\,dy^* &= \lambda^*\,dP_y
\end{aligned} \tag{12.37}
$$

To study the effect of a change in the budget size (also referred to as the income of the
consumer), let $dP_x = dP_y = 0$, but keep $dB \neq 0$. Then, after dividing (12.37) through by
$dB$, and interpreting each ratio of differentials as a partial derivative, we can write the
matrix equation†

$$
\begin{bmatrix} 0 & -P_x & -P_y \\ -P_x & U_{xx} & U_{xy} \\ -P_y & U_{yx} & U_{yy} \end{bmatrix}
\begin{bmatrix} \partial\lambda^*/\partial B \\ \partial x^*/\partial B \\ \partial y^*/\partial B \end{bmatrix}
= \begin{bmatrix} -1 \\ 0 \\ 0 \end{bmatrix} \tag{12.38}
$$


† The matrix equation (12.38) can also be obtained by totally differentiating (12.36) with respect to
B, while bearing in mind the implicit solutions in (12.35).




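As you can verify, the array of elements in the coefficient matrix is exactly the same as
what would appear in the Jacobian $|J|$, which has the same value as the bordered Hessian
$|\bar{H}|$ although the latter has $P_x$ and $P_y$ (rather than $-P_x$ and $-P_y$) in the first row and the
first column. By Cramer's rule, we can solve for all three comparative-static derivatives, but
we shall confine our attention to the following two:

$$
\left(\frac{\partial x^*}{\partial B}\right) = \frac{1}{|J|}
\begin{vmatrix} 0 & -1 & -P_y \\ -P_x & 0 & U_{xy} \\ -P_y & 0 & U_{yy} \end{vmatrix}
= \frac{1}{|J|} \begin{vmatrix} -P_x & U_{xy} \\ -P_y & U_{yy} \end{vmatrix} \tag{12.39}
$$

$$
\left(\frac{\partial y^*}{\partial B}\right) = \frac{1}{|J|}
\begin{vmatrix} 0 & -P_x & -1 \\ -P_x & U_{xx} & 0 \\ -P_y & U_{yx} & 0 \end{vmatrix}
= -\frac{1}{|J|} \begin{vmatrix} -P_x & U_{xx} \\ -P_y & U_{yx} \end{vmatrix} \tag{12.40}
$$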


By the second-order condition, $|J| = |\bar{H}|$ is positive, as are $P_x$ and $P_y$. Unfortunately, in
the absence of additional information about the relative magnitudes of $P_x$, $P_y$, and the $U_{ij}$,
we are still unable to ascertain the signs of these two comparative-static derivatives. This
means that, as the consumer's budget (or income) increases, his or her optimal purchases
$x^*$ and $y^*$ may either increase or decrease. In case, say, $x^*$ decreases as B increases,
product x is referred to as an inferior good as against a normal good.
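The Cramer's-rule computation behind (12.39) and (12.40) can be sketched in code. This is only an illustration under assumed values: a hypothetical utility function U = xy (so that U_xx = U_yy = 0 and U_xy = U_yx = 1) and prices P_x = 2, P_y = 1. The helper names `det2`, `det3`, and `cramer` are invented for this example.

```python
from fractions import Fraction

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    return (m[0][0] * det2([[m[1][1], m[1][2]], [m[2][1], m[2][2]]])
            - m[0][1] * det2([[m[1][0], m[1][2]], [m[2][0], m[2][2]]])
            + m[0][2] * det2([[m[1][0], m[1][1]], [m[2][0], m[2][1]]]))

# Hypothetical example: U = x*y, so U_xx = U_yy = 0 and U_xy = U_yx = 1.
Px, Py = Fraction(2), Fraction(1)
Uxx, Uxy, Uyx, Uyy = Fraction(0), Fraction(1), Fraction(1), Fraction(0)

# Coefficient matrix of (12.38); its determinant is |J|.
J = [[Fraction(0), -Px, -Py],
     [-Px, Uxx, Uxy],
     [-Py, Uyx, Uyy]]
rhs = [Fraction(-1), Fraction(0), Fraction(0)]

def cramer(matrix, constants, col):
    """Replace column `col` by the constants and divide by the determinant."""
    replaced = [row[:] for row in matrix]
    for i in range(3):
        replaced[i][col] = constants[i]
    return det3(replaced) / det3(matrix)

dx_dB = cramer(J, rhs, 1)   # corresponds to (12.39)
dy_dB = cramer(J, rhs, 2)   # corresponds to (12.40)
print(dx_dB, dy_dB)
```

For this assumed utility function both derivatives come out positive, so both goods are normal; that is a property of the chosen U, not a general result of the model.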

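Next, we may analyze the effect of a change in $P_x$. Letting $dP_y = dB = 0$ this time, but
keeping $dP_x \neq 0$, and then dividing (12.37) through by $dP_x$, we obtain another matrix
equation:

$$
\begin{bmatrix} 0 & -P_x & -P_y \\ -P_x & U_{xx} & U_{xy} \\ -P_y & U_{yx} & U_{yy} \end{bmatrix}
\begin{bmatrix} \partial\lambda^*/\partial P_x \\ \partial x^*/\partial P_x \\ \partial y^*/\partial P_x \end{bmatrix}
= \begin{bmatrix} x^* \\ \lambda^* \\ 0 \end{bmatrix} \tag{12.41}
$$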


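From this, the following comparative-static derivatives emerge:

$$
\begin{aligned}
\left(\frac{\partial x^*}{\partial P_x}\right) &= \frac{1}{|J|}
\begin{vmatrix} 0 & x^* & -P_y \\ -P_x & \lambda^* & U_{xy} \\ -P_y & 0 & U_{yy} \end{vmatrix} \\
&= -\frac{x^*}{|J|} \begin{vmatrix} -P_x & U_{xy} \\ -P_y & U_{yy} \end{vmatrix}
 + \frac{\lambda^*}{|J|} \begin{vmatrix} 0 & -P_y \\ -P_y & U_{yy} \end{vmatrix} \\
&\equiv T_1 + T_2 \qquad [T_i \text{ means the } i\text{th term}]
\end{aligned} \tag{12.42}
$$

$$
\begin{aligned}
\left(\frac{\partial y^*}{\partial P_x}\right) &= \frac{1}{|J|}
\begin{vmatrix} 0 & -P_x & x^* \\ -P_x & U_{xx} & \lambda^* \\ -P_y & U_{yx} & 0 \end{vmatrix} \\
&= \frac{x^*}{|J|} \begin{vmatrix} -P_x & U_{xx} \\ -P_y & U_{yx} \end{vmatrix}
 - \frac{\lambda^*}{|J|} \begin{vmatrix} 0 & -P_x \\ -P_y & U_{yx} \end{vmatrix} \\
&\equiv T_3 + T_4
\end{aligned} \tag{12.43}
$$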


How do we interpret these two results? The first one, $(\partial x^*/\partial P_x)$, tells how a change in
$P_x$ affects the optimal purchase of x; it thus provides the basis for the study of our
consumer's demand function for x. There are two component terms in this effect. The first term,
$T_1$, can be rewritten, by using (12.39), as $-(\partial x^*/\partial B)x^*$. In this light, $T_1$ seems to be a




measure of the effect of a change in B (budget, or income) upon the optimal purchase $x^*$,
with $x^*$ itself serving as a weighting factor. However, since this derivative obviously is
concerned with a price change, $T_1$ must be interpreted as the income effect of a price change.
As $P_x$ rises, the decline in the consumer's real income will produce an effect on $x^*$ similar
to that of an actual decrease in B; hence the use of the term $-(\partial x^*/\partial B)$. Understandably,
the more prominent the place of commodity x in the total budget, the greater this income
effect will be, and hence the appearance of the weighting factor $x^*$ in $T_1$. This
interpretation can be demonstrated more formally by expressing the consumer's effective income
loss by the differential $dB = -x^*\,dP_x$. Then we have

$$
x^* = -\frac{dB}{dP_x} \tag{12.44}
$$

and

$$
T_1 = -\left(\frac{\partial x^*}{\partial B}\right)x^* = \left(\frac{\partial x^*}{\partial B}\right)\frac{dB}{dP_x}
$$


which shows $T_1$ to be the measure of the effect of $dP_x$ on $x^*$ via B, that is, the income
effect.

If we now compensate the consumer for the effective income loss by a cash payment
numerically equal to $dB$, then, because of the neutralization of the income effect, the
remaining component in the comparative-static derivative $(\partial x^*/\partial P_x)$, namely, $T_2$, will
measure the change in $x^*$ due entirely to price-induced substitution of one commodity for
another, i.e., the substitution effect of the change in $P_x$. To see this more clearly, let us
return to (12.37), and consider how the income compensation will modify the situation.
When studying the effect of $dP_x$ only (with $dP_y = dB = 0$), the first equation in (12.37)
can be written as $-P_x\,dx^* - P_y\,dy^* = x^*\,dP_x$. Since the indication of the effective income
loss to the consumer lies in the expression $x^*\,dP_x$ (which, incidentally, appears only in the
first equation), to compensate the consumer means to set this term equal to zero. If so, the
vector of constants in (12.41) must be changed from
$\begin{bmatrix} x^* \\ \lambda^* \\ 0 \end{bmatrix}$ to $\begin{bmatrix} 0 \\ \lambda^* \\ 0 \end{bmatrix}$,
and the income-compensated version of the derivative $(\partial x^*/\partial P_x)$ will be

$$
\left(\frac{\partial x^*}{\partial P_x}\right)_{\text{compensated}} = \frac{1}{|J|}
\begin{vmatrix} 0 & 0 & -P_y \\ -P_x & \lambda^* & U_{xy} \\ -P_y & 0 & U_{yy} \end{vmatrix}
= \frac{\lambda^*}{|J|} \begin{vmatrix} 0 & -P_y \\ -P_y & U_{yy} \end{vmatrix} = T_2
$$


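Hence, we may express (12.42) in the form

$$
\left(\frac{\partial x^*}{\partial P_x}\right) = T_1 + T_2
= \underbrace{-\left(\frac{\partial x^*}{\partial B}\right)x^*}_{\text{income effect}}
+ \underbrace{\left(\frac{\partial x^*}{\partial P_x}\right)_{\text{compensated}}}_{\text{substitution effect}} \tag{12.42'}
$$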








This result, which decomposes the comparative-static derivative $(\partial x^*/\partial P_x)$ into two
components, an income effect and a substitution effect, is the two-good version of the so-called
Slutsky equation.
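The decomposition can be checked numerically. The sketch below assumes a hypothetical utility function U = xy, for which the demand functions are x* = B/(2P_x) and y* = B/(2P_y) and the multiplier is λ* = B/(2P_x P_y); the parameter values are likewise invented for illustration. It computes the two terms T_1 and T_2 of (12.42) and compares their sum with the derivative obtained directly from the demand function.

```python
from fractions import Fraction

# Hypothetical example: U = x*y, whose demand functions are
# x* = B/(2 P_x), y* = B/(2 P_y), with lambda* = B/(2 P_x P_y).
Px, Py, B = Fraction(2), Fraction(1), Fraction(40)
x_star = B / (2 * Px)
lam_star = B / (2 * Px * Py)

Uyy = Fraction(0)
J = 2 * Px * Py                  # |J| for U = x*y (since U_xy = 1)

dx_dB = 1 / (2 * Px)             # from differentiating x* = B/(2 P_x)

# T1: income effect, -(dx*/dB) x*.
T1 = -dx_dB * x_star
# T2: substitution effect, (lambda*/|J|) * det [[0, -Py], [-Py, Uyy]].
T2 = (lam_star / J) * (0 * Uyy - (-Py) * (-Py))

# Total effect obtained directly from x* = B/(2 P_x).
dx_dPx_direct = -B / (2 * Px**2)

print(T1, T2, dx_dPx_direct)
```

Here the income and substitution effects happen to be equal in size, which is a peculiarity of the assumed utility function.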

What can we say about the sign of $(\partial x^*/\partial P_x)$? The substitution effect $T_2$ is clearly
negative, because $|J| > 0$ and $\lambda^* > 0$ [see (12.31')]. The income effect $T_1$, on the other hand,
is indeterminate in sign according to (12.39). Should it be negative, it would reinforce $T_2$;
in that event, an increase in $P_x$ must decrease the purchase of x, and the demand curve of




the utility-maximizing consumer would be negatively sloped. Should it be positive, but
relatively small in magnitude, it would dilute the substitution effect, though the overall result
would still be a downward-sloping demand curve. But in case $T_1$ is positive and dominates
$T_2$ (such as when $x^*$ is a significant item in the consumer budget, thus providing an
overwhelming weighting factor), then a rise in $P_x$ will actually lead to a larger purchase of x,
a special demand situation characteristic of what are called Giffen goods. Normally, of
course, we would expect $(\partial x^*/\partial P_x)$ to be negative.

Finally, let us examine the comparative-static derivative in (12.43), $(\partial y^*/\partial P_x) =
T_3 + T_4$, which has to do with the cross effect of a change in the price of x on the optimal
purchase of y. The term $T_3$ bears a striking resemblance to term $T_1$ and again has the
interpretation of an income effect.† Note that the weighting factor here is again $x^*$ (rather than
$y^*$); this is because we are studying the effect of a change in $P_x$ on effective income, which
depends for its magnitude upon the relative importance of $x^*$ (not $y^*$) in the consumer
budget. Naturally, the remaining term, $T_4$, is again a measure of the substitution effect.

The sign of $T_3$ is, according to (12.40), dependent on such factors as $U_{xx}$, $U_{yx}$, etc., and
is indeterminate without further restrictions on the model. However, the substitution effect
$T_4$ will surely be positive in our model, since $\lambda^*$, $P_x$, $P_y$, and $|J|$ are all positive. This means
that, unless more than offset by a negative income effect, an increase in the price of x will
always increase the purchase of y in our two-commodity model. In other words, in the
context of the present model, where the consumer can choose only between two goods, these
goods must bear a relationship to each other as substitutes.
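The qualification "unless more than offset by a negative income effect" can be made concrete with a small sketch. It again assumes a hypothetical utility function U = xy (demands x* = B/(2P_x), y* = B/(2P_y), multiplier λ* = B/(2P_x P_y)) and invented parameter values; for that particular function the two terms of (12.43) cancel exactly.

```python
from fractions import Fraction

# Hypothetical example: U = x*y, with demands x* = B/(2 P_x), y* = B/(2 P_y)
# and lambda* = B/(2 P_x P_y). None of these values comes from the text.
Px, Py, B = Fraction(2), Fraction(1), Fraction(40)
x_star = B / (2 * Px)
lam_star = B / (2 * Px * Py)
Uyx = Fraction(1)
J = 2 * Px * Py                    # |J| for U = x*y

dy_dB = 1 / (2 * Py)               # from y* = B/(2 P_y)

# T3: income effect of the change in P_x on y*, -(dy*/dB) x*.
T3 = -dy_dB * x_star
# T4: substitution effect, -(lambda*/|J|) * det [[0, -Px], [-Py, Uyx]].
T4 = -(lam_star / J) * (0 * Uyx - (-Px) * (-Py))

cross_effect = T3 + T4
print(T3, T4, cross_effect)
```

The substitution effect is positive, as the text asserts, but the negative income effect is of the same size, so y* does not respond to P_x at all. This agrees with y* = B/(2P_y), which does not contain P_x.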

Even though the preceding analysis relates to the effects of a change in $P_x$, our results
are readily adaptable to the case of a change in $P_y$. Our model happens to be such that
the positions occupied by the variables x and y are perfectly symmetrical. Thus, to infer the
effects of a change in $P_y$, all that it takes is to interchange the roles of x and y in the results
already obtained.


Proportionate Changes in Prices and Income 
It is also of interest to ask how $x^*$ and $y^*$ will be affected when all three parameters $P_x$, $P_y$,
and B are changed in the same proportion. Such a question still lies within the realm of
comparative statics, but unlike the preceding analysis, the present inquiry now involves the
simultaneous change of all the parameters.

When both prices are raised, along with income, by the same multiple j, every term in
the budget constraint will increase j-fold, to become

$$
jB - jxP_x - jyP_y = 0
$$

Inasmuch as the common factor j can be canceled out, however, this new constraint is in fact
identical with the old. The utility function, moreover, is independent of these parameters.
Consequently, the old equilibrium levels of x and y will continue to prevail; that is, the
consumer equilibrium position in our model is invariant to equal proportionate changes in all
the prices and in the income. Thus, in the present model, the consumer is seen to be free
from any "money illusion."


† If you need a stronger dose of assurance that $T_3$ represents the income effect, you can use (12.40)
and (12.44) to write

$$
T_3 = -\left(\frac{\partial y^*}{\partial B}\right)x^* = \left(\frac{\partial y^*}{\partial B}\right)\frac{dB}{dP_x}
$$

Thus $T_3$ is the effect of a change in $P_x$ on $y^*$ via the income factor B.







Symbolically, this situation can be described by the equations

$$
\begin{aligned}
x^*(P_x, P_y, B) &= x^*(jP_x, jP_y, jB) \\
y^*(P_x, P_y, B) &= y^*(jP_x, jP_y, jB)
\end{aligned}
$$

The functions $x^*$ and $y^*$, with the invariance property just cited, are no ordinary functions;
they are examples of a special class of function known as homogeneous functions, which
have interesting economic applications. We shall therefore examine these in Sec. 12.6.
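The invariance property is easy to see in a concrete case. The sketch below assumes the demand functions x* = B/(2P_x) and y* = B/(2P_y), which belong to a hypothetical utility function U = xy rather than to anything specified in the text, and scales all three parameters by several values of j.

```python
from fractions import Fraction

def demand(Px, Py, B):
    """Demand functions (x*, y*) for the hypothetical utility U = x*y."""
    return B / (2 * Px), B / (2 * Py)

# Baseline parameters (assumed for illustration only).
base = demand(Fraction(2), Fraction(1), Fraction(40))

# Scale both prices and income by the same multiple j.
multiples = (Fraction(1, 2), Fraction(3), Fraction(10))
scaled = [demand(j * 2, j * 1, j * 40) for j in multiples]

print(base, all(s == base for s in scaled))
```

Every scaled parameter set returns the baseline purchases, which is what freedom from money illusion means for these assumed demand functions.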





EXERCISE 12.5
1. Given $U = (x + 2)(y + 1)$ and $P_x = 4$, $P_y = 6$, and $B = 130$:
   (a) Write the Lagrangian function.
   (b) Find the optimal levels of purchase $x^*$ and $y^*$.
   (c) Is the second-order sufficient condition for maximum satisfied?
   (d) Does the answer in (b) give any comparative-static information?
2. Assume that $U = (x + 2)(y + 1)$, but this time assign no specific numerical values to
   the price and income parameters.
   (a) Write the Lagrangian function.
   (b) Find $x^*$, $y^*$, and $\lambda^*$ in terms of the parameters $P_x$, $P_y$, and B.
   (c) Check the second-order sufficient condition for maximum.
   (d) By setting $P_x = 4$, $P_y = 6$, and $B = 130$, check the validity of your answer to
       Prob. 1.
3. Can your solution ($x^*$ and $y^*$) in Prob. 2 yield any comparative-static information? Find
   all the comparative-static derivatives you can, evaluate their signs, and interpret their
   economic meanings.
4. From the utility function $U = (x + 2)(y + 1)$ and the constraint $xP_x + yP_y = B$ of
   Prob. 2, we have already found the $U_{ij}$ and $|\bar{H}|$, as well as $x^*$ and $\lambda^*$. Moreover, we
   recall that $|J| = |\bar{H}|$.
   (a) Substitute these into (12.39) and (12.40) to find $(\partial x^*/\partial B)$ and $(\partial y^*/\partial B)$.
   (b) Substitute into (12.42) and (12.43) to find $(\partial x^*/\partial P_x)$ and $(\partial y^*/\partial P_x)$.
   Do these results check with those obtained in Prob. 3?
5. Comment on the validity of the statement: "If the derivative $(\partial x^*/\partial P_x)$ is negative,
   then x cannot possibly represent an inferior good."
6. When studying the effect of $dP_x$ alone, the first equation in (12.37) reduces to
   $-P_x\,dx^* - P_y\,dy^* = x^*\,dP_x$, and when we compensate for the consumer's effective
   income loss by dropping the term $x^*\,dP_x$, the equation becomes $-P_x\,dx^* - P_y\,dy^* = 0$.
   Show that this last result can be obtained alternatively from a compensation procedure
   whereby we try to keep the consumer's optimal utility level $U^*$ (rather than effective
   income) unchanged, so that the term $T_2$ can alternatively be interpreted as
   $(\partial x^*/\partial P_x)_{U^* = \text{constant}}$. [Hint: Make use of (12.31'').]
7. (a) Does the assumption of diminishing marginal utility to goods x and y imply strictly
       convex indifference curves?
   (b) Does the assumption of strict convexity in the indifference curves imply
       diminishing marginal utility to goods x and y?




12.6 Homogeneous Functions 





A function is said to be homogeneous of degree r, if multiplication of each of its
independent variables by a constant j will alter the value of the function by the proportion $j^r$, that
is, if

$$
f(jx_1, \ldots, jx_n) = j^r f(x_1, \ldots, x_n)
$$

In general, j can take any value. However, in order for the preceding equation to make
sense, $(jx_1, \ldots, jx_n)$ must not lie outside the domain of the function f. For this reason, in
economic applications the constant j is usually taken to be positive, as most economic
variables do not admit negative values.

Example 1
Given the function $f(x, y, w) = x/y + 2w/3x$, if we multiply each variable by j, we get

$$
f(jx, jy, jw) = \frac{(jx)}{(jy)} + \frac{2(jw)}{3(jx)} = \frac{x}{y} + \frac{2w}{3x} = f(x, y, w) = j^0 f(x, y, w)
$$

In this particular example, the value of the function will not be affected at all by equal
proportionate changes in all the independent variables; or, one might say, the value of the
function is changed by a multiple of $j^0$ $(= 1)$. This makes the function f a homogeneous
function of degree zero.

You will observe that the functions $x^*$ and $y^*$ cited at the end of Sec. 12.5 are both
homogeneous of degree zero.

Example 2
When we multiply each variable in the function

$$
g(x, y, w) = \frac{x^2}{y} + \frac{2w^2}{x}
$$

by j, we get

$$
g(jx, jy, jw) = \frac{(jx)^2}{(jy)} + \frac{2(jw)^2}{(jx)} = j\left(\frac{x^2}{y} + \frac{2w^2}{x}\right) = j\,g(x, y, w)
$$

The function g is homogeneous of degree one (or, of the first degree); multiplication of
each variable by j will alter the value of the function exactly j-fold as well.

Example 3
Now, consider the function $h(x, y, w) = 2x^2 + 3yw - w^2$. A similar multiplication this time
will give us

$$
h(jx, jy, jw) = 2(jx)^2 + 3(jy)(jw) - (jw)^2 = j^2 h(x, y, w)
$$

Thus the function h is homogeneous of degree two; in this case, a doubling of all variables,
for example, will quadruple the value of the function.
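The definition lends itself to a direct numerical test. The following sketch uses the functions f and h of Examples 1 and 3 and checks the defining equation at one test point for a few multiples j. The helper name `is_homogeneous`, the test point, and the multiples are chosen here for illustration; passing at sampled points is of course only a spot check, not a proof.

```python
from fractions import Fraction

def f(x, y, w):
    """Function of Example 1: x/y + 2w/(3x)."""
    return x / y + 2 * w / (3 * x)

def h(x, y, w):
    """Function of Example 3: 2x^2 + 3yw - w^2."""
    return 2 * x**2 + 3 * y * w - w**2

def is_homogeneous(func, degree, point, scales):
    """Check func(j*point) == j**degree * func(point) for each j in scales."""
    base = func(*point)
    return all(func(*(j * v for v in point)) == j**degree * base
               for j in scales)

# Exact rational arithmetic avoids floating-point rounding in the comparison.
point = (Fraction(3), Fraction(5), Fraction(7))
scales = (Fraction(1, 2), Fraction(2), Fraction(9))

print(is_homogeneous(f, 0, point, scales))   # degree zero
print(is_homogeneous(h, 2, point, scales))   # degree two
print(is_homogeneous(h, 1, point, scales))   # wrong degree, fails the check
```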


Linear Homogeneity 

In the discussion of production functions, wide use is made of homogeneous functions of
the first degree. These are often referred to as linearly homogeneous functions, the adverb
linearly modifying the adjective homogeneous. Some writers, however, seem to prefer
the somewhat misleading terminology linear homogeneous functions, or even linear and




homogeneous functions, which tends to convey, wrongly, the impression that the functions
themselves are linear. On the basis of the function g in Example 2, we know that a function
which is homogeneous of the first degree is not necessarily linear in itself. Hence you
should avoid using the terms "linear homogeneous functions" and "linear and
homogeneous functions" unless, of course, the functions in question are indeed linear. Note,
however, that it is not incorrect to speak of "linear homogeneity," meaning homogeneity of
degree one, because to modify a noun (homogeneity) does call for the use of an adjective
(linear).

Since the primary field of application of linearly homogeneous functions is in the theory
of production, let us adopt as the framework of our discussion a production function in the
form, say,

$$
Q = f(K, L) \tag{12.45}
$$


Whether applied at the micro or the macro level, the mathematical assumption of linear ho- 
mogeneity would amount to the economic assumption of constant returns to scale, because 
linear homogeneity means that raising all inputs (independent variables) j-fold will always 
raise the output (value of the function) exactly j-fold also. 

What unique properties characterize this linearly homogeneous production function? 


Property I  Given the linearly homogeneous production function $Q = f(K, L)$, the
average physical product of labor ($APP_L$) and of capital ($APP_K$) can be expressed as functions
of the capital-labor ratio, $k \equiv K/L$, alone.


To prove this, we multiply each independent variable in (12.45) by a factor $j = 1/L$. By
virtue of linear homogeneity, this will change the output from Q to $jQ = Q/L$. The right
side of (12.45) will correspondingly become

$$
f\left(\frac{K}{L}, \frac{L}{L}\right) = f\left(\frac{K}{L}, 1\right) = f(k, 1)
$$

Since the variables K and L in the original function are to be replaced (whenever they
appear) by k and 1, respectively, the right side in effect becomes a function of the
capital-labor ratio k alone, say, $\phi(k)$, which is a function with a single argument, k, even
though two independent variables K and L are actually involved in that argument. Equating
the two sides, we have

$$
APP_L \equiv \frac{Q}{L} = \phi(k) \tag{12.46}
$$

The expression for $APP_K$ is then found to be

$$
APP_K \equiv \frac{Q}{K} = \frac{Q}{L}\,\frac{L}{K} = \frac{\phi(k)}{k} \tag{12.47}
$$

Since both average products depend on k alone, linear homogeneity implies that, as long
as the K/L ratio is kept constant (whatever the absolute levels of K and L), the average
products will be constant, too. Therefore, while the production function is homogeneous of
degree one, both $APP_L$ and $APP_K$ are homogeneous of degree zero in the variables K and
L, since equal proportionate changes in K and L (maintaining a constant k) will not alter the
magnitudes of the average products.




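Property II  Given a linearly homogeneous production function $Q = f(K, L)$, the
marginal physical products $MPP_L$ and $MPP_K$ can be expressed as functions of k alone.

To find the marginal products, we first write the total product as

$$
Q = L\phi(k) \qquad [\text{by } (12.46)] \tag{12.45'}
$$

and then differentiate Q with respect to K and L. For this purpose, we shall find the
following two preliminary results to be of service:

$$
\frac{\partial k}{\partial K} = \frac{\partial}{\partial K}\left(\frac{K}{L}\right) = \frac{1}{L} \qquad
\frac{\partial k}{\partial L} = \frac{\partial}{\partial L}\left(\frac{K}{L}\right) = \frac{-K}{L^2} \tag{12.48}
$$

The results of differentiation are

$$
\begin{aligned}
MPP_K \equiv \frac{\partial Q}{\partial K} &= \frac{\partial}{\partial K}[L\phi(k)] \\
&= L\frac{\partial \phi(k)}{\partial K} = L\frac{d\phi(k)}{dk}\frac{\partial k}{\partial K} && [\text{chain rule}] \\
&= L\phi'(k)\left(\frac{1}{L}\right) = \phi'(k) && [\text{by } (12.48)]
\end{aligned} \tag{12.49}
$$

$$
\begin{aligned}
MPP_L \equiv \frac{\partial Q}{\partial L} &= \frac{\partial}{\partial L}[L\phi(k)] \\
&= \phi(k) + L\frac{\partial \phi(k)}{\partial L} && [\text{product rule}] \\
&= \phi(k) + L\phi'(k)\frac{\partial k}{\partial L} && [\text{chain rule}] \\
&= \phi(k) + L\phi'(k)\frac{-K}{L^2} && [\text{by } (12.48)] \\
&= \phi(k) - k\phi'(k)
\end{aligned} \tag{12.50}
$$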


which indeed show that $MPP_K$ and $MPP_L$ are functions of k alone.

Like average products, the marginal products will remain the same as long as the
capital-labor ratio is held constant; they are homogeneous of degree zero in the variables K
and L.
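Properties I and II can be verified on a concrete function. The sketch below assumes a hypothetical linearly homogeneous production function Q = KL/(K + L), which is not taken from the text; for it, φ(k) = k/(k + 1) and φ'(k) = 1/(k + 1)², and the partial derivatives are worked out by hand in the docstrings.

```python
from fractions import Fraction

def Q(K, L):
    """Hypothetical linearly homogeneous production function K*L/(K+L)."""
    return K * L / (K + L)

def phi(k):
    """phi(k) = f(k, 1) = k/(k+1), so that Q = L*phi(k)."""
    return k / (k + 1)

def phi_prime(k):
    """Derivative of phi: 1/(k+1)^2."""
    return 1 / (k + 1) ** 2

def mpp_K(K, L):
    """Partial derivative of Q with respect to K: L^2/(K+L)^2."""
    return L**2 / (K + L) ** 2

def mpp_L(K, L):
    """Partial derivative of Q with respect to L: K^2/(K+L)^2."""
    return K**2 / (K + L) ** 2

K, L = Fraction(6), Fraction(2)
k = K / L

# Property I: the average products depend on k alone.
app_L, app_K = Q(K, L) / L, Q(K, L) / K
# Property II: the marginal products depend on k alone.
print(app_L == phi(k), app_K == phi(k) / k)
print(mpp_K(K, L) == phi_prime(k), mpp_L(K, L) == phi(k) - k * phi_prime(k))
```

Tripling both inputs leaves k unchanged, so `mpp_K(3 * K, 3 * L)` returns the same value as `mpp_K(K, L)`, which illustrates homogeneity of degree zero.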


Property III (Euler's theorem)  If $Q = f(K, L)$ is linearly homogeneous, then

$$
K\frac{\partial Q}{\partial K} + L\frac{\partial Q}{\partial L} \equiv Q
$$

PROOF

$$
\begin{aligned}
K\frac{\partial Q}{\partial K} + L\frac{\partial Q}{\partial L}
&= K\phi'(k) + L[\phi(k) - k\phi'(k)] && [\text{by } (12.49), (12.50)] \\
&= K\phi'(k) + L\phi(k) - K\phi'(k) && [k \equiv K/L] \\
&= L\phi(k) = Q && [\text{by } (12.45')]
\end{aligned}
$$

Note that this result is valid for any values of K and L; this is why the property can be
written as an identical equality. What this property says is that the value of a linearly
homogeneous function can always be expressed as a sum of terms, each of which is the




product of one of the independent variables and the first-order partial derivative with
respect to that variable, regardless of the levels of the two inputs actually employed. Be
careful, however, to distinguish between the identity
$K\dfrac{\partial Q}{\partial K} + L\dfrac{\partial Q}{\partial L} \equiv Q$ [Euler's theorem,
which applies only to the constant-returns-to-scale case of $Q = f(K, L)$] and the equation
$dQ = \dfrac{\partial Q}{\partial K}\,dK + \dfrac{\partial Q}{\partial L}\,dL$ [total differential of Q, for any function $Q = f(K, L)$].
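A short sketch can show that the identity depends on linear homogeneity. It assumes two hypothetical functions that do not appear in the text: Q1 = KL/(K + L), which is linearly homogeneous, and Q2 = KL, which is homogeneous of degree two. The partial derivatives are entered by hand.

```python
from fractions import Fraction

def euler_sum(K, L, dQ_dK, dQ_dL):
    """Return K*(dQ/dK) + L*(dQ/dL) at the point (K, L)."""
    return K * dQ_dK(K, L) + L * dQ_dL(K, L)

# Hypothetical linearly homogeneous function Q1 = K*L/(K + L).
def Q1(K, L):
    return K * L / (K + L)

def Q1_K(K, L):
    return L**2 / (K + L) ** 2

def Q1_L(K, L):
    return K**2 / (K + L) ** 2

# Hypothetical function homogeneous of degree two: Q2 = K*L.
def Q2(K, L):
    return K * L

def Q2_K(K, L):
    return L

def Q2_L(K, L):
    return K

points = [(Fraction(6), Fraction(2)),
          (Fraction(1), Fraction(5)),
          (Fraction(7, 3), Fraction(9, 4))]

euler_holds_Q1 = all(euler_sum(K, L, Q1_K, Q1_L) == Q1(K, L) for K, L in points)
euler_holds_Q2 = all(euler_sum(K, L, Q2_K, Q2_L) == Q2(K, L) for K, L in points)
print(euler_holds_Q1, euler_holds_Q2)
```

For Q2 the weighted sum of the partial derivatives equals 2Q rather than Q at every test point, so the product-exhaustion identity fails once returns to scale are not constant.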


Economically, this property means that under conditions of constant returns to scale, if
each input factor is paid the amount of its marginal product, the total product will be
exactly exhausted by the distributive shares for all the input factors, or, equivalently, the
pure economic profit will be zero. Since this situation is descriptive of the long-run
equilibrium under pure competition, it was once thought that only linearly homogeneous
production functions would make sense in economics. This, of course, is not the case. The zero
economic profit in the long-run equilibrium is brought about by the forces of competition
through the entry and exit of firms, regardless of the specific nature of the production
functions actually prevailing. Thus it is not mandatory to have a production function that
ensures product exhaustion for any and all (K, L) pairs. Moreover, when imperfect
competition exists in the factor markets, the remuneration to the factors may not be equal to the
marginal products, and, consequently, Euler's theorem becomes irrelevant to the
distribution picture. However, linearly homogeneous production functions are often convenient to
work with because of the various nice mathematical properties they are known to possess.


Cobb-Douglas Production Function 
One specific production function widely used in economic analysis (earlier cited in
Sec. 11.6, Example 5) is the Cobb-Douglas production function:

$$
Q = AK^{\alpha}L^{1-\alpha} \tag{12.51}
$$

where A is a positive constant, and $\alpha$ is a positive fraction. What we shall consider here first
is a generalized version of this function, namely,

$$
Q = AK^{\alpha}L^{\beta} \tag{12.52}
$$

where $\beta$ is another positive fraction which may or may not be equal to $1 - \alpha$. Some of the
major features of this function are: (1) it is homogeneous of degree $(\alpha + \beta)$; (2) in the
special case of $\alpha + \beta = 1$, it is linearly homogeneous; (3) its isoquants are negatively sloped
throughout and strictly convex for positive values of K and L; and (4) it is strictly
quasiconcave for positive K and L.

Its homogeneity is easily seen from the fact that, by changing K and L to jK and jL,
respectively, the output will be changed to

$$
A(jK)^{\alpha}(jL)^{\beta} = j^{\alpha+\beta}(AK^{\alpha}L^{\beta}) = j^{\alpha+\beta}Q
$$

That is, the function is homogeneous of degree $(\alpha + \beta)$. In case $\alpha + \beta = 1$, there will be
constant returns to scale, because the function will be linearly homogeneous. (Note,
however, that this function is not linear! It would thus be confusing to refer to it as a "linear
homogeneous" or "linear and homogeneous" function.) That its isoquants have negative
slopes and strict convexity can be verified from the signs of the derivatives $dK/dL$ and




$d^2K/dL^2$ (or the signs of $dL/dK$ and $d^2L/dK^2$). For any positive output $Q_0$, (12.52)
can be written as

$$
AK^{\alpha}L^{\beta} = Q_0 \qquad (A, K, L, Q_0 > 0)
$$

Taking the natural log of both sides and transposing, we find that

$$
\ln A + \alpha \ln K + \beta \ln L - \ln Q_0 = 0
$$

which implicitly defines K as a function of L.† By the implicit-function rule and the log
rule, therefore, we have

$$
\frac{dK}{dL} = -\frac{\partial F/\partial L}{\partial F/\partial K} = -\frac{(\beta/L)}{(\alpha/K)} = -\frac{\beta K}{\alpha L} < 0
$$

Then it follows that

$$
\frac{d^2K}{dL^2} = \frac{d}{dL}\left(-\frac{\beta K}{\alpha L}\right)
= -\frac{\beta}{\alpha}\frac{d}{dL}\left(\frac{K}{L}\right)
= -\frac{\beta}{\alpha}\frac{1}{L^2}\left(L\frac{dK}{dL} - K\right) > 0
$$

The signs of these derivatives establish the isoquant (any isoquant) to be downward-sloping
throughout and strictly convex in the LK plane for positive values of K and L. This, of
course, is only to be expected from a function that is strictly quasiconcave for positive K
and L. For the strict quasiconcavity feature of this function, see Example 5 of Sec. 12.4,
where a similar function was discussed.
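These features can be checked numerically. The sketch below assumes hypothetical parameter values (A = 2, α = 0.3, β = 0.5) and a test point (K, L) = (4, 9), none of which comes from the text. It tests the degree of homogeneity and compares the slope and curvature formulas for the isoquant with finite-difference approximations.

```python
import math

A, alpha, beta = 2.0, 0.3, 0.5   # hypothetical parameter values

def Q(K, L):
    """Generalized Cobb-Douglas function A * K^alpha * L^beta."""
    return A * K**alpha * L**beta

def isoquant_K(L, Q0):
    """Solve A * K^alpha * L^beta = Q0 for K."""
    return (Q0 / (A * L**beta)) ** (1 / alpha)

K0, L0 = 4.0, 9.0
Q0 = Q(K0, L0)

# Homogeneity of degree alpha + beta: output rises by j^(alpha+beta).
j = 3.0
ratio = Q(j * K0, j * L0) / Q0

# Slope of the isoquant: formula versus central finite difference.
slope_formula = -beta * K0 / (alpha * L0)
step = 1e-5
slope_numeric = (isoquant_K(L0 + step, Q0)
                 - isoquant_K(L0 - step, Q0)) / (2 * step)

# Second derivative: formula versus second-order finite difference.
curvature_formula = -(beta / alpha) * (L0 * slope_formula - K0) / L0**2
step2 = 1e-3
curvature_numeric = (isoquant_K(L0 + step2, Q0) - 2 * isoquant_K(L0, Q0)
                     + isoquant_K(L0 - step2, Q0)) / step2**2

print(ratio, j ** (alpha + beta))
print(slope_formula, slope_numeric)
print(curvature_formula, curvature_numeric)
```

With these assumed values the slope is negative and the second derivative positive, in line with a downward-sloping, strictly convex isoquant.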

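Let us now examine the $\alpha + \beta = 1$ case (the Cobb-Douglas function proper), to verify
the three properties of linear homogeneity cited earlier. First of all, the total product in this
special case is expressible as

$$
Q = AK^{\alpha}L^{1-\alpha} = A\left(\frac{K}{L}\right)^{\alpha}L = LAk^{\alpha} \tag{12.51'}
$$

where the expression $Ak^{\alpha}$ is a specific version of the general expression $\phi(k)$ used before.
Therefore, the average products are

$$
\begin{aligned}
APP_L &= \frac{Q}{L} = Ak^{\alpha} \\
APP_K &= \frac{Q}{K} = \frac{Q}{L}\,\frac{L}{K} = \frac{Ak^{\alpha}}{k} = Ak^{\alpha-1}
\end{aligned} \tag{12.53}
$$

both of which are now functions of k alone.

Second, differentiation of $Q = AK^{\alpha}L^{1-\alpha}$ yields the marginal products:

$$
\begin{aligned}
\frac{\partial Q}{\partial K} &= A\alpha K^{\alpha-1}L^{-(\alpha-1)} = A\alpha\left(\frac{K}{L}\right)^{\alpha-1} = A\alpha k^{\alpha-1} \\
\frac{\partial Q}{\partial L} &= AK^{\alpha}(1-\alpha)L^{-\alpha} = A(1-\alpha)\left(\frac{K}{L}\right)^{\alpha} = A(1-\alpha)k^{\alpha}
\end{aligned} \tag{12.54}
$$

and these are also functions of k alone.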


† The conditions of the implicit-function theorem are satisfied, because F (the left-side expression) has
continuous partial derivatives, and because $\partial F/\partial K = \alpha/K \neq 0$ for positive values of K.




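Last, we can verify Euler's theorem by using (12.54) as follows:

$$
\begin{aligned}
K\frac{\partial Q}{\partial K} + L\frac{\partial Q}{\partial L}
&= KA\alpha k^{\alpha-1} + LA(1-\alpha)k^{\alpha} \\
&= LAk^{\alpha}\left(\frac{K\alpha}{Lk} + 1 - \alpha\right) \\
&= LAk^{\alpha}(\alpha + 1 - \alpha) = LAk^{\alpha} = Q && [\text{by } (12.51')]
\end{aligned}
$$

Interesting economic meanings can be assigned to the exponents $\alpha$ and $(1 - \alpha)$ in the
linearly homogeneous Cobb-Douglas production function. If each input is assumed to be
paid by the amount of its marginal product, the relative share of total product accruing to
capital will be

$$
\frac{K(\partial Q/\partial K)}{Q} = \frac{KA\alpha k^{\alpha-1}}{LAk^{\alpha}} = \alpha
$$

Similarly, labor's relative share will be

$$
\frac{L(\partial Q/\partial L)}{Q} = \frac{LA(1-\alpha)k^{\alpha}}{LAk^{\alpha}} = 1 - \alpha
$$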


Thus the exponent of each input variable indicates the relative share of that input in the
total product. Looking at it another way, we can also interpret the exponent of each input
variable as the partial elasticity of output with respect to that input. This is because the
capital-share expression just given is equivalent to the expression
$\dfrac{\partial Q/\partial K}{Q/K} \equiv \varepsilon_{QK}$ and,
similarly, the labor-share expression just given is precisely that of $\varepsilon_{QL}$.

What about the meaning of the constant A? For given values of K and L, the magnitude
of A will proportionately affect the level of Q. Hence A may be considered as an efficiency
parameter, i.e., as an indicator of the state of technology.
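The share interpretation can be illustrated with numbers. This sketch assumes hypothetical values A = 1.5 and α = 0.25 and an arbitrary input pair (K, L) = (16, 81); the marginal products are coded from (12.54).

```python
import math

A, alpha = 1.5, 0.25       # hypothetical values; the labor exponent is 1 - alpha

def Q(K, L):
    """Linearly homogeneous Cobb-Douglas function A * K^alpha * L^(1-alpha)."""
    return A * K**alpha * L ** (1 - alpha)

def mpp_K(K, L):
    """Marginal product of capital, A * alpha * k^(alpha-1), with k = K/L."""
    k = K / L
    return A * alpha * k ** (alpha - 1)

def mpp_L(K, L):
    """Marginal product of labor, A * (1-alpha) * k^alpha, with k = K/L."""
    k = K / L
    return A * (1 - alpha) * k**alpha

K, L = 16.0, 81.0
capital_share = K * mpp_K(K, L) / Q(K, L)
labor_share = L * mpp_L(K, L) / Q(K, L)

print(capital_share, labor_share, capital_share + labor_share)
```

The two shares reproduce the exponents and sum to one, which is Euler's theorem restated in terms of distributive shares. Changing A rescales output and both marginal products in the same proportion, so the shares are unaffected.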


Extensions of the Results 
We have discussed linear homogeneity in the specific context of production functions, but 
the properties cited are equally valid in other contexts, provided the variables K, L, and Q 
are properly reinterpreted. 

Furthermore, it is possible to extend our results to the case of more than two variables.
With a linearly homogeneous function

$$
y = f(x_1, x_2, \ldots, x_n)
$$

we can again divide each variable by $x_1$ (that is, multiply by $1/x_1$) and get the result

$$
y = x_1\,\phi\left(\frac{x_2}{x_1}, \frac{x_3}{x_1}, \ldots, \frac{x_n}{x_1}\right) \qquad [\text{homogeneity of degree 1}]
$$

which is comparable to (12.45'). Moreover, Euler's theorem is easily extended to the form

$$
\sum_{i=1}^{n} x_i f_i \equiv y \qquad [\text{Euler's theorem}]
$$




where the partial derivatives of the original function / (namely, 7;) are again homogencous 
of degree zero in the variables x;, as in the two-variable case, 

The preceding extensions can, in fact, also be generalized with relative ease to a
homogeneous function of degree r. In the first place, by definition of homogeneity, we can in the
present case write

$$y = x_1^{\,r}\,\phi\!\left(\frac{x_2}{x_1}, \frac{x_3}{x_1}, \ldots, \frac{x_n}{x_1}\right) \qquad \text{[homogeneity of degree } r\text{]}$$

The modified version of Euler's theorem will now appear in the form

$$\sum_{i=1}^{n} x_i f_i \equiv r y \qquad \text{[Euler's theorem]}$$

where a multiplicative constant r has been attached to the dependent variable y on the right.
And, finally, the partial derivatives of the original function f, the $f_i$, will all be
homogeneous of degree (r − 1) in the variables $x_i$. You can thus see that the linear-homogeneity
case is merely a special case thereof, in which r = 1.
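
The degree-r version of Euler's theorem can be checked numerically. The following is only a sketch: the function `g`, the sample point, and the scale factor `j` are hypothetical choices made for illustration, and the partial derivatives are approximated by central differences rather than computed exactly.

```python
def partial(f, x, i, h=1e-6):
    """Central-difference estimate of the i-th first partial derivative of f at x."""
    up, down = list(x), list(x)
    up[i] += h
    down[i] -= h
    return (f(up) - f(down)) / (2 * h)


def euler_sum(f, x):
    """Left side of Euler's theorem: the sum of x_i * f_i over all variables."""
    return sum(xi * partial(f, x, i) for i, xi in enumerate(x))


def g(x):
    """Hypothetical example, homogeneous of degree r = 2 in (x1, x2, x3)."""
    x1, x2, x3 = x
    return x1 * x2 + x3 ** 2


r = 2
point = [2.0, 3.0, 1.5]
j = 4.0
scaled = [j * xi for xi in point]

lhs = euler_sum(g, point)   # sum of x_i * g_i
rhs = r * g(point)          # r * y
print(lhs, rhs)

# Each partial derivative should itself be homogeneous of degree r - 1,
# so scaling all variables by j should scale g_i by j ** (r - 1).
for i in range(len(point)):
    print(partial(g, scaled, i), j ** (r - 1) * partial(g, point, i))
```

Setting r = 1 and choosing a linearly homogeneous function in place of `g` gives the special case discussed earlier in the section.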











EXERCISE 12.6 
1. Determine whether the following functions are homogeneous. If so, of what degree?
(a) $f(x, y) = \sqrt{xy}$                (d) $f(x, y) = 2x + y + 3\sqrt{xy}$
(b) $f(x, y) = (x^2 - y^2)^{1/2}$          (e) $f(x, y, w) = \dfrac{xy^2}{w} + 2xw$
(c) $f(x, y) = x^3 - xy + y^3$             (f) $f(x, y, w) = x^4 - 5yw^3$
2. Show that the function (12.45) can be expressed alternatively as $Q = K\psi(L/K)$ instead
of $Q = L\phi(K/L)$.
3. Deduce from Euler's theorem that, with constant returns to scale:
(a) When $MPP_K = 0$, $APP_L$ is equal to $MPP_L$.
(b) When $MPP_L = 0$, $APP_K$ is equal to $MPP_K$.
4. On the basis of (12.46) through (12.50), check whether the following are true under
conditions of constant returns to scale:
(a) An $APP_L$ curve can be plotted against k (= K/L) as the independent variable (on
the horizontal axis).
(b) $MPP_K$ is measured by the slope of that $APP_L$ curve.
(c) $APP_K$ is measured by the slope of the radius vector to the $APP_L$ curve.
(d) $MPP_L = APP_L - k(MPP_K) = APP_L - k$ (slope of $APP_L$).
5. Use (12.53) and (12.54) to verify that the relations described in Prob. 4b, c, and d are
obeyed by the Cobb-Douglas production function.
6. Given the production function $Q = AK^{\alpha}L^{\beta}$, show that:
(a) $\alpha + \beta > 1$ implies increasing returns to scale.
(b) $\alpha + \beta < 1$ implies decreasing returns to scale.
(c) $\alpha$ and $\beta$ are, respectively, the partial elasticities of output with respect to the capital
and labor inputs.




7. Let output be a function of three inputs: $Q = AK^{a}L^{b}N^{c}$.
(a) Is this function homogeneous? If so, of what degree?
(b) Under what condition would there be constant returns to scale? Increasing returns
to scale?
(c) Find the share of product for input N, if it is paid by the amount of its marginal
product.
8. Let the production function Q = g(K, L) be homogeneous of degree 2.
(a) Write an equation to express the second-degree homogeneity property of this
function.
(b) Find an expression for Q in terms of $\phi(k)$, in the vein of (12.45').
(c) Find the $MPP_K$ function. Is $MPP_K$ still a function of k alone, as in the
linear-homogeneity case?
(d) Is the $MPP_K$ function homogeneous in K and L? If so, of what degree?


12.7 Least-Cost Combination of Inputs





As another example of constrained optimization, let us discuss the problem of finding the
least-cost input combination for the production of a specified level of output $Q_0$
representing, say, a customer's special order. Here we shall work with a general production function;
later on, however, reference will be made to homogeneous production functions.





First-Order Condition 
Assuming a smooth production function with two variable inputs, $Q = Q(a, b)$, where
$Q_a, Q_b > 0$, and assuming both input prices to be exogenous (though again omitting the
zero subscript), we may formulate the problem as one of minimizing the cost

$$C = aP_a + bP_b$$

subject to the output constraint

$$Q(a, b) = Q_0$$

Hence, the Lagrangian function is

$$Z = aP_a + bP_b + \mu[Q_0 - Q(a, b)]$$

To satisfy the first-order condition for a minimum C, the input levels (the choice
variables) must satisfy the following simultaneous equations:

$$\begin{aligned}
Z_\mu &= Q_0 - Q(a, b) = 0 \\
Z_a &= P_a - \mu Q_a = 0 \\
Z_b &= P_b - \mu Q_b = 0
\end{aligned}$$

The first equation in this set is merely the constraint restated, and the last two imply the
condition

$$\frac{P_a}{Q_a} = \frac{P_b}{Q_b} = \mu \tag{12.55}$$


FIGURE 12.8 [figure not reproduced: the isoquant (Q = Q_0) drawn in the ab plane, with b on the vertical axis]


At the point of optimal input combination, the input-price-marginal-product ratio must be
the same for each input. Since this ratio measures the amount of outlay per unit of marginal
product of the input in question, the Lagrange multiplier can be given the interpretation of
the marginal cost of production in the optimum state. This interpretation is, of course,
entirely consistent with our earlier discovery in (12.16) that the optimal value of the Lagrange
multiplier measures the comparative-static effect of the constraint constant on the optimal
value of the objective function, that is, $\mu^* = (§C^*/§Q_0)$, where the § symbol indicates that
this is a partial total derivative.
Equation (12.55) can be alternatively written in the form

$$\frac{P_a}{P_b} = \frac{Q_a}{Q_b} \tag{12.55'}$$

which you should compare with (12.31). Presented in this form, the first-order condition
can be explained in terms of isoquants and isocosts. As we learned in (11.36), the $Q_a/Q_b$
ratio is the negative of the slope of an isoquant; that is, it is a measure of the marginal rate
of technical substitution of a for b ($MRTS_{ab}$). In the present model, the output level is
specified at $Q_0$; thus only one isoquant is involved, as shown in Fig. 12.8, with a negative slope.

The $P_a/P_b$ ratio, on the other hand, represents the negative of the slope of isocosts (a
notion comparable with the budget line in consumer theory). An isocost, defined as the locus
of the input combinations that entail the same total cost, is expressible by the equation

$$C_0 = aP_a + bP_b \qquad \text{or} \qquad b = \frac{C_0}{P_b} - \frac{P_a}{P_b}\,a$$

where $C_0$ stands for a (parametric) cost figure. When plotted in the ab plane, as in Fig. 12.8,
therefore, it yields a family of straight lines with (negative) slope $-P_a/P_b$ (and vertical
intercept $C_0/P_b$). The equality of the two ratios therefore amounts to the equality of the
slopes of the isoquant and a selected isocost. Since we are compelled to stay on the given
isoquant, this condition leads us to the point of tangency E and the input combination
$(a^*, b^*)$.
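
As an illustration of the first-order condition, the sketch below solves the least-cost problem for a Cobb-Douglas function $Q = Aa^{\alpha}b^{\beta}$, which is an assumed functional form (the discussion above uses a general production function). For this form, dividing the last two first-order equations gives $b/a = \beta P_a/(\alpha P_b)$, and the constraint then pins down the input levels. All parameter values are hypothetical. The script checks that the two price-to-marginal-product ratios in (12.55) coincide and that their common value approximates the marginal cost of output, estimated here by a finite difference.

```python
def least_cost_inputs(Q0, Pa, Pb, A, alpha, beta):
    """Least-cost (a*, b*) for Q = A * a**alpha * b**beta at output level Q0.

    Combines the tangency condition Pa/Pb = Qa/Qb with the output constraint.
    """
    k = beta * Pa / (alpha * Pb)                       # optimal ratio b*/a*
    a = (Q0 / (A * k ** beta)) ** (1.0 / (alpha + beta))
    return a, k * a


def min_cost(Q0, Pa, Pb, A, alpha, beta):
    """Minimized cost C* as a function of the required output Q0."""
    a, b = least_cost_inputs(Q0, Pa, Pb, A, alpha, beta)
    return a * Pa + b * Pb


# Hypothetical parameter values, chosen only for illustration.
A, alpha, beta = 2.0, 0.3, 0.6
Pa, Pb, Q0 = 4.0, 3.0, 10.0

a_star, b_star = least_cost_inputs(Q0, Pa, Pb, A, alpha, beta)
Q = A * a_star ** alpha * b_star ** beta
Qa, Qb = alpha * Q / a_star, beta * Q / b_star         # marginal products

mu_a, mu_b = Pa / Qa, Pb / Qb                          # the two ratios in (12.55)
h = 1e-5
marginal_cost = (min_cost(Q0 + h, Pa, Pb, A, alpha, beta)
                 - min_cost(Q0 - h, Pa, Pb, A, alpha, beta)) / (2 * h)
print(a_star, b_star, mu_a, mu_b, marginal_cost)
```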




Second-Order Condition 
To ensure a minimum cost, it is sufficient (after the first-order condition is met) to have a
negative bordered Hessian, i.e., to have

$$|\bar{H}| = \begin{vmatrix} 0 & Q_a & Q_b \\ Q_a & -\mu Q_{aa} & -\mu Q_{ab} \\ Q_b & -\mu Q_{ba} & -\mu Q_{bb} \end{vmatrix} = \mu\left(Q_{aa}Q_b^2 - 2Q_{ab}Q_aQ_b + Q_{bb}Q_a^2\right) < 0$$

Since the optimal value of $\mu$ (marginal cost) is positive, this reduces to the condition that
the expression in parentheses be negative when evaluated at E.
From (11.44), we recall that the curvature of an isoquant is represented by the second
derivative

$$\frac{d^2b}{da^2} = -\frac{1}{Q_b^3}\left(Q_{aa}Q_b^2 - 2Q_{ab}Q_aQ_b + Q_{bb}Q_a^2\right)$$

in which the same parenthetical expression appears. Inasmuch as $Q_b$ is positive, the
satisfaction of the second-order sufficient condition would imply that $d^2b/da^2$ is positive (that
is, the isoquant is strictly convex) at the point of tangency. In the present context, the strict
convexity of the isoquant would also imply the satisfaction of the second-order sufficient
condition. For, since the isoquant is negatively sloped, strict convexity can mean only a
positive $d^2b/da^2$ (zero $d^2b/da^2$ is possible only at a stationary point on the isoquant), which
would in turn ensure that $|\bar{H}| < 0$. However, it should again be borne in mind that the
sufficient condition $|\bar{H}| < 0$ (and hence the strict convexity of the isoquant) at the tangency is,
per se, not necessary for the minimization of C. Specifically, C can be minimized even
when the isoquant is (nonstrictly) convex, in a multiple-minimum situation analogous to
Fig. 12.7b, with $d^2b/da^2 = 0$ and $|\bar{H}| = 0$ at each minimum.

In discussing the utility-maximization model (Sec. 12.5), it was pointed out that a
smooth, increasing, strictly quasiconcave utility function U = U(x, y) gives rise to
everywhere strictly convex, downward-sloping indifference curves in the xy plane. Since the
notion of isoquants is almost identical with that of indifference curves,† we can reason by
analogy that a smooth, increasing, strictly quasiconcave production function Q = Q(a, b)
can generate everywhere strictly convex, downward-sloping isoquants in the ab plane. If
such a production function is assumed, then obviously the second-order sufficient
condition will always be satisfied. Moreover, it should be clear that the resulting C* will be a
unique absolute constrained minimum.
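
The link between the bordered Hessian and the curvature of the isoquant can be illustrated with a small numerical sketch. It assumes a Cobb-Douglas function $Q = Aa^{\alpha}b^{\beta}$ with hypothetical parameter values, and it evaluates the expressions at an arbitrary positive input bundle with an arbitrary positive multiplier, since for this functional form the signs do not depend on the point chosen.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def cobb_douglas_derivatives(a, b, A, alpha, beta):
    """First and second partial derivatives of Q = A * a**alpha * b**beta."""
    Q = A * a ** alpha * b ** beta
    Qa, Qb = alpha * Q / a, beta * Q / b
    Qaa = alpha * (alpha - 1) * Q / a ** 2
    Qbb = beta * (beta - 1) * Q / b ** 2
    Qab = alpha * beta * Q / (a * b)
    return Qa, Qb, Qaa, Qab, Qbb


# Hypothetical values: any positive input bundle and multiplier will do here.
A, alpha, beta = 2.0, 0.3, 0.6
a, b, mu = 3.0, 8.0, 4.0

Qa, Qb, Qaa, Qab, Qbb = cobb_douglas_derivatives(a, b, A, alpha, beta)
bordered = [[0.0, Qa, Qb],
            [Qa, -mu * Qaa, -mu * Qab],
            [Qb, -mu * Qab, -mu * Qbb]]

# The parenthetical expression shared by the determinant and the curvature.
paren = Qaa * Qb ** 2 - 2 * Qab * Qa * Qb + Qbb * Qa ** 2
H_bar = det3(bordered)
curvature = -paren / Qb ** 3        # d^2 b / d a^2 along the isoquant
print(H_bar, mu * paren, curvature)
```

A negative determinant and a positive curvature appear together, as the argument above requires.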


The Expansion Path 

Let us now turn to one of the comparative-static aspects of this model. Assuming a fixed
ratio of the two input prices, let us postulate successive increases of $Q_0$ (ascent to higher
and higher isoquants) and trace the effect on the least-cost combination $b^*/a^*$. Each shift
of the isoquant, of course, will result in a new point of tangency, with a higher isocost. The
locus of such points of tangency, known as the expansion path of the firm, serves to
describe the least-cost combinations required to produce varying levels of $Q_0$. Two possible
shapes of the expansion path are shown in Fig. 12.9.


† Both are in the nature of "isovalue" curves. They differ only in the field of application; indifference
curves are used in models of consumption, and isoquants, in models of production.


FIGURE 12.9 [figure not reproduced: two possible shapes of the expansion path, shown in panels (a) and (b)]


If we assume the strict convexity of the isoquants (hence, satisfaction of the
second-order condition), the expansion path will be derivable directly from the first-order condition
(12.55'). Let us illustrate this for the generalized version of the Cobb-Douglas production
function.

The condition (12.55') requires the equality of the input-price ratio and the
marginal-product ratio. For the function $Q = Aa^{\alpha}b^{\beta}$, this means that each point on the expansion
path must satisfy

$$\frac{P_a}{P_b} = \frac{Q_a}{Q_b} = \frac{A\alpha a^{\alpha-1}b^{\beta}}{Aa^{\alpha}\beta b^{\beta-1}} = \frac{\alpha b}{\beta a} \tag{12.56}$$

implying that the optimal input ratio should be

$$\frac{b^*}{a^*} = \frac{\beta P_a}{\alpha P_b} = \text{a constant} \tag{12.57}$$

since $\alpha$, $\beta$, and the input prices are all constant. As a result, all points on the expansion path
must show the same fixed input ratio; i.e., the expansion path must be a straight line
emanating from the point of origin. This is illustrated in Fig. 12.9b, where the input ratios at the
various points of tangency (AE/OA, A'E'/OA', and A''E''/OA'') are all equal.

The linearity of the expansion path is characteristic of the generalized Cobb-Douglas
function whether or not $\alpha + \beta = 1$, because the derivation of the result in (12.57) does not
rely on the assumption $\alpha + \beta = 1$. As a matter of fact, any homogeneous production
function (not necessarily the Cobb-Douglas) will give rise to a linear expansion path for each
set of input prices, for the following reason: if it is homogeneous of (say) degree r,
both marginal-product functions $Q_a$ and $Q_b$ must be homogeneous of degree (r − 1) in the
inputs a and b; thus a j-fold increase in both inputs will produce a $j^{r-1}$-fold change in
the values of both $Q_a$ and $Q_b$, which will leave the $Q_a/Q_b$ ratio intact. Therefore, if the
first-order condition $P_a/P_b = Q_a/Q_b$ is satisfied at given input prices by a particular input
combination $(a_0, b_0)$, it must also be satisfied by a combination $(ja_0, jb_0)$, precisely as is
depicted by the linear expansion path in Fig. 12.9b.
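
The scaling argument can be checked on a homogeneous function that is not of the Cobb-Douglas type. The function below, $(a^{1/2} + 2b^{1/2})^4$, is a hypothetical example that is homogeneous of degree r = 2; the starting bundle and the scale factors are likewise arbitrary. The sketch confirms that a j-fold increase in both inputs multiplies each marginal product by $j^{r-1}$ and leaves the ratio $Q_a/Q_b$ unchanged.

```python
def Q(a, b):
    """Hypothetical production function, homogeneous of degree r = 2."""
    return (a ** 0.5 + 2.0 * b ** 0.5) ** 4


def marginal_products(a, b):
    """Analytic partial derivatives Q_a and Q_b of the function above."""
    core = (a ** 0.5 + 2.0 * b ** 0.5) ** 3
    return 2.0 * core * a ** -0.5, 4.0 * core * b ** -0.5


r = 2
a0, b0 = 1.0, 4.0
Qa0, Qb0 = marginal_products(a0, b0)

rows = []
for j in (2.0, 3.0, 5.0):
    Qa, Qb = marginal_products(j * a0, j * b0)
    # Columns: scale factor, growth of Q_a, growth of Q_b, ratio Q_a/Q_b,
    # and growth of output itself (which should be j ** r).
    rows.append((j, Qa / Qa0, Qb / Qb0, Qa / Qb, Q(j * a0, j * b0) / Q(a0, b0)))

for row in rows:
    print(row)
print("ratio at the starting bundle:", Qa0 / Qb0)
```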

Although any homogeneous production function can give rise to a linear expansion path,
the specific degree of homogeneity does make a significant difference in the interpretation
of the expansion path. In Fig. 12.9b, we have drawn the distance OE equal to that of EE', so
that point E' involves a doubling of the scale of point E. Now if the production function is
homogeneous of degree one, the output at E' must be twice ($2^1 = 2$) that of E. But if the
degree of homogeneity is two, the output at E' will be four times ($2^2 = 4$) that of E. Thus,
the spacing of the isoquants for Q = 1, Q = 2, ..., will be widely different for different
degrees of homogeneity.


Homothetic Functions
We have explained that, given a set of input prices, homogeneity (of any degree) of the
production function produces a linear expansion path. But linear expansion paths are not
unique to homogeneous production functions, for a more general class of functions, known
as homothetic functions, can produce them, too.

Homotheticity can arise from a composite function in the form

$$H = h[Q(a, b)] \qquad [h'(Q) \neq 0] \tag{12.58}$$

where Q(a, b) is homogeneous of degree r. Although derived from a homogeneous
function, the function H = H(a, b) is in general not homogeneous in the variables a and b.
Nonetheless, the expansion paths of H(a, b), like those of Q(a, b), are linear. The key to
this result is that, at any given point in the ab plane, the H isoquant shares the same slope
as the Q isoquant:

$$\text{Slope of } H \text{ isoquant} = -\frac{H_a}{H_b} = -\frac{h'(Q)Q_a}{h'(Q)Q_b} = -\frac{Q_a}{Q_b} = \text{slope of } Q \text{ isoquant} \tag{12.59}$$

Now the linearity of the expansion paths of Q(a, b) implies, and is implied by, the
condition

$$\frac{Q_a}{Q_b} = \text{constant for any given } \frac{b}{a}$$

In view of (12.59), however, we immediately have

$$\frac{H_a}{H_b} = \text{constant for any given } \frac{b}{a} \tag{12.60}$$

as well. And this establishes that H(a, b) also produces linear expansion paths.

The concept of homotheticity is more general than that of homogeneity. In fact, every
homogeneous function is automatically a member of the homothetic family, but a
homothetic function may be a function outside the homogeneous family. The fact that a
homogeneous function is always homothetic can be seen from (12.58), where if we let the
function H = h(Q) take the specific form H = Q, with h'(Q) = dH/dQ = 1, then
the function Q, being identical with the function H itself, is obviously homothetic. That
a homothetic function may not be homogeneous will be illustrated in Example 2, which
follows.

In defining the homothetic function H, we specified in (12.58) that h'(Q) ≠ 0. This
enables us to avoid division by zero in (12.59). While the specification h'(Q) ≠ 0 is the only
requirement from the mathematical standpoint, economic considerations would suggest the
stronger restriction h'(Q) > 0. For if H(a, b) is, like Q(a, b), to serve as a production
function, that is, if H is to denote output, then $H_a$ and $H_b$ should, respectively, be made to
go in the same direction as $Q_a$ and $Q_b$ in the Q(a, b) function. Thus H(a, b) needs to be
restricted to be a monotonically increasing transformation of Q(a, b).

Homothetic production functions (including the special case of homogeneous ones)
possess the interesting property that the (partial) elasticity of optimal input level with
respect to the output level is uniform for all inputs. To see this, recall that the linearity of
expansion paths of homothetic functions means that the optimal input ratio $b^*/a^*$ is
unaffected by a change in the exogenous output level $H_0$. Thus $\partial(b^*/a^*)/\partial H_0 = 0$ or

$$\frac{1}{a^{*2}}\left(a^*\frac{\partial b^*}{\partial H_0} - b^*\frac{\partial a^*}{\partial H_0}\right) = 0 \qquad \text{[quotient rule]}$$

Multiplying through by $a^{*2}H_0/(a^*b^*)$, and rearranging, we then get

$$\frac{\partial a^*}{\partial H_0}\frac{H_0}{a^*} = \frac{\partial b^*}{\partial H_0}\frac{H_0}{b^*} \qquad \text{or} \qquad \varepsilon_{a^*H_0} = \varepsilon_{b^*H_0}$$

which is what we previously asserted.


Example 1

Let $H = Q^2$, where $Q = Aa^{\alpha}b^{\beta}$. Since Q(a, b) is homogeneous and h'(Q) = 2Q is positive
for positive output, H(a, b) is homothetic for Q > 0. We shall verify that it satisfies (12.60).
First, by substitution, we have

$$H = Q^2 = (Aa^{\alpha}b^{\beta})^2 = A^2a^{2\alpha}b^{2\beta}$$

Thus the slope of the isoquants of H is expressed by

$$-\frac{H_a}{H_b} = -\frac{A^2\,2\alpha\,a^{2\alpha-1}b^{2\beta}}{A^2a^{2\alpha}\,2\beta\,b^{2\beta-1}} = -\frac{\alpha b}{\beta a} \tag{12.61}$$

This result satisfies (12.60) and implies linear expansion paths. A comparison of (12.61)
with (12.56) also shows that the function H satisfies (12.59).

In this example, Q(a, b) is homogeneous of degree $(\alpha + \beta)$. As it turns out, H(a, b) is also
homogeneous, but of degree $2(\alpha + \beta)$. As a rule, however, a homothetic function is not
necessarily homogeneous.


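Example 2

Let $H = e^Q$, where $Q = Aa^{\alpha}b^{\beta}$. Since Q(a, b) is homogeneous and $h'(Q) = e^Q$ is positive,
H(a, b) is homothetic. From this function,

$$H(a, b) = \exp(Aa^{\alpha}b^{\beta})$$

it is easily found that

$$-\frac{H_a}{H_b} = -\frac{A\alpha a^{\alpha-1}b^{\beta}\exp(Aa^{\alpha}b^{\beta})}{Aa^{\alpha}\beta b^{\beta-1}\exp(Aa^{\alpha}b^{\beta})} = -\frac{\alpha b}{\beta a}$$

This result is, of course, identical with (12.61) in Example 1. This time, however, the
homothetic function is not homogeneous, because

$$H(ja, jb) = \exp[A(ja)^{\alpha}(jb)^{\beta}] = \exp(Aa^{\alpha}b^{\beta}j^{\alpha+\beta}) = [\exp(Aa^{\alpha}b^{\beta})]^{j^{\alpha+\beta}} = [H(a, b)]^{j^{\alpha+\beta}} \neq j^rH(a, b)$$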
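
The two claims made about $H = e^Q$ (same isoquant slopes as Q, but no homogeneity) can be checked numerically. This is a sketch under assumed parameter values; the helper `slope_ratio` is a hypothetical name, and it approximates the partial derivatives by central differences.

```python
import math

A, alpha, beta = 1.0, 0.4, 0.3        # hypothetical parameter values


def Q(a, b):
    """Homogeneous Cobb-Douglas function of degree alpha + beta."""
    return A * a ** alpha * b ** beta


def H(a, b):
    """Monotonic transformation H = exp(Q)."""
    return math.exp(Q(a, b))


def slope_ratio(f, a, b, h=1e-6):
    """f_a / f_b by central differences (the negative of the isoquant slope)."""
    fa = (f(a + h, b) - f(a - h, b)) / (2 * h)
    fb = (f(a, b + h) - f(a, b - h)) / (2 * h)
    return fa / fb


a, b, j = 2.0, 3.0, 2.0
ratio_H = slope_ratio(H, a, b)
ratio_Q = slope_ratio(Q, a, b)
ratio_H_scaled = slope_ratio(H, j * a, j * b)   # further out on the same ray
print(ratio_H, ratio_Q, alpha * b / (beta * a), ratio_H_scaled)

# If H were homogeneous of some degree r, H(ja, jb) / H(a, b) would equal
# j ** r at every point. Here the ratio changes from point to point.
growth_1 = H(j * 2.0, j * 3.0) / H(2.0, 3.0)
growth_2 = H(j * 1.0, j * 1.0) / H(1.0, 1.0)
print(growth_1, growth_2)
```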




Elasticity of Substitution 

Another aspect of comparative statics has to do with the effect of a change in the $P_a/P_b$
ratio upon the least-cost input combination $b^*/a^*$ for producing the same given output
$Q_0$ (that is, while we stay on the same isoquant).

When the (exogenous) input-price ratio $P_a/P_b$ rises, we can normally expect the
optimal input ratio $b^*/a^*$ also to rise, because input b (now relatively cheaper) will tend to be
substituted for input a. The direction of substitution is clear, but what about its extent? The
extent of input substitution can be measured by the following point-elasticity expression,
called the elasticity of substitution and denoted by $\sigma$ (lowercase Greek letter sigma, for
"substitution"):

$$\sigma \equiv \frac{\text{relative change in } (b^*/a^*)}{\text{relative change in } (P_a/P_b)} = \frac{\dfrac{d(b^*/a^*)}{b^*/a^*}}{\dfrac{d(P_a/P_b)}{P_a/P_b}} = \frac{\dfrac{d(b^*/a^*)}{d(P_a/P_b)}}{\dfrac{b^*/a^*}{P_a/P_b}} \tag{12.62}$$

The value of $\sigma$ can be anywhere between 0 and ∞; the larger the $\sigma$, the greater the
substitutability between the two inputs. The limiting case of $\sigma = 0$ is where the two inputs must
be used in a fixed proportion as complements to each other. The other limiting case, with $\sigma$
infinite, is where the two inputs are perfect substitutes for each other. Note that, if $(b^*/a^*)$
is considered as a function of $(P_a/P_b)$, then the elasticity $\sigma$ will again be the ratio of a
marginal function to an average function.†

For illustration, let us calculate the elasticity of substitution for the generalized
Cobb-Douglas production function. We learned earlier that, for this case, the least-cost input
combination is specified by

$$\left(\frac{b^*}{a^*}\right) = \frac{\beta}{\alpha}\left(\frac{P_a}{P_b}\right) \qquad \text{[from (12.57)]}$$

This equation is in the form y = ax, for which dy/dx (the marginal) and y/x (the average)
are both equal to the constant a. That is,

$$\frac{d(b^*/a^*)}{d(P_a/P_b)} = \frac{\beta}{\alpha} \qquad \text{and} \qquad \frac{b^*/a^*}{P_a/P_b} = \frac{\beta}{\alpha}$$

Substituting these values into (12.62), we immediately find that $\sigma = 1$; that is, the
generalized Cobb-Douglas production function is characterized by a constant, unitary elasticity of
substitution. Note that the derivation of this result in no way relies upon the assumption that
$\alpha + \beta = 1$. Thus the elasticity of substitution of the production function $Q = Aa^{\alpha}b^{\beta}$ will
be unitary even if $\alpha + \beta \neq 1$.
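
The marginal-over-average calculation can be reproduced numerically. In the sketch below the exponent pairs and the price ratio are hypothetical, and the marginal function is approximated by a central difference; the point is only that the resulting $\sigma$ stays at 1 whether or not $\alpha + \beta = 1$.

```python
def optimal_ratio(price_ratio, alpha, beta):
    """b*/a* as a function of Pa/Pb for the generalized Cobb-Douglas case."""
    return (beta / alpha) * price_ratio


def elasticity_of_substitution(ratio_fn, p, step=1e-4):
    """Marginal function divided by average function, evaluated at price ratio p."""
    marginal = (ratio_fn(p + step) - ratio_fn(p - step)) / (2 * step)
    average = ratio_fn(p) / p
    return marginal / average


# Hypothetical exponents; note that alpha + beta need not equal 1.
cases = [(0.3, 0.7), (0.5, 0.9), (0.2, 0.4)]
sigmas = []
for alpha, beta in cases:
    sigma = elasticity_of_substitution(
        lambda p: optimal_ratio(p, alpha, beta), p=1.5)
    sigmas.append(sigma)
    print(alpha, beta, sigma)
```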





† There is an alternative way of expressing $\sigma$. Since, at the point of tangency, we always have

$$\frac{P_a}{P_b} = \frac{Q_a}{Q_b} = MRTS_{ab}$$

the elasticity of substitution can be defined equivalently as

$$\sigma = \frac{\text{relative change in } (b^*/a^*)}{\text{relative change in } MRTS_{ab}} = \frac{\dfrac{d(b^*/a^*)}{b^*/a^*}}{\dfrac{d(Q_a/Q_b)}{Q_a/Q_b}} = \frac{\dfrac{d(b^*/a^*)}{d(Q_a/Q_b)}}{\dfrac{b^*/a^*}{Q_a/Q_b}} \tag{12.62'}$$










CES Production Function 

More recently, there has come into common use another form of production function
which, while still characterized by a constant elasticity of substitution (CES), can yield
a $\sigma$ with a (constant) value other than 1.† The equation of this function, known as the CES
production function, is

$$Q = A[\delta K^{-\rho} + (1 - \delta)L^{-\rho}]^{-1/\rho} \qquad (A > 0;\ 0 < \delta < 1;\ -1 < \rho \neq 0) \tag{12.63}$$

where K and L represent two factors of production, and A, $\delta$, and $\rho$ (lowercase Greek letter
rho) are three parameters. The parameter A (the efficiency parameter) plays the same role
as the coefficient A in the Cobb-Douglas function; it serves as an indicator of the state of
technology. The parameter $\delta$ (the distribution parameter), like the $\alpha$ in the Cobb-Douglas
function, has to do with the relative factor shares in the product. And the parameter $\rho$ (the
substitution parameter), which has no counterpart in the Cobb-Douglas function, is
what determines the value of the (constant) elasticity of substitution, as will be shown later
in this section.

First, however, let us observe that this function is homogeneous of degree one. If we
replace K and L by jK and jL, respectively, the output will change from Q to

$$A[\delta(jK)^{-\rho} + (1 - \delta)(jL)^{-\rho}]^{-1/\rho} = A\{j^{-\rho}[\delta K^{-\rho} + (1 - \delta)L^{-\rho}]\}^{-1/\rho} = (j^{-\rho})^{-1/\rho}Q = jQ$$

Consequently, the CES function, like all linearly homogeneous production functions,
displays constant returns to scale, qualifies for the application of Euler's theorem, and
possesses average products and marginal products that are homogeneous of degree zero in the
variables K and L.
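
These three consequences of linear homogeneity can be verified numerically for the CES function. The sketch below uses hypothetical parameter values and input levels, and it estimates the marginal products by central differences rather than from a closed-form expression.

```python
def ces(K, L, A=1.0, delta=0.4, rho=0.5):
    """CES function (12.63); the default parameter values are hypothetical."""
    return A * (delta * K ** -rho + (1 - delta) * L ** -rho) ** (-1.0 / rho)


def ces_marginal_products(K, L, h=1e-6):
    """Q_K and Q_L by central differences."""
    QK = (ces(K + h, L) - ces(K - h, L)) / (2 * h)
    QL = (ces(K, L + h) - ces(K, L - h)) / (2 * h)
    return QK, QL


K, L, j = 4.0, 9.0, 3.0
Q = ces(K, L)
QK, QL = ces_marginal_products(K, L)
QK_scaled, QL_scaled = ces_marginal_products(j * K, j * L)

print(ces(j * K, j * L), j * Q)       # constant returns to scale
print(K * QK + L * QL, Q)             # Euler's theorem with r = 1
print(QK_scaled, QK, QL_scaled, QL)   # marginal products: degree zero
```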

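We may also note that the isoquants generated by the CES production function are
always negatively sloped and strictly convex for positive values of K and L. To show this,
let us first find the expressions for the marginal products $Q_L$ and $Q_K$. Using the notation
[···] as a shorthand for $[\delta K^{-\rho} + (1 - \delta)L^{-\rho}]$, we have

$$\begin{aligned}
Q_L \equiv \frac{\partial Q}{\partial L} &= A\left(-\frac{1}{\rho}\right)[\cdots]^{-(1/\rho)-1}(1 - \delta)(-\rho)L^{-\rho-1} \\
&= (1 - \delta)A[\cdots]^{-(1+\rho)/\rho}L^{-(1+\rho)} \\
&= (1 - \delta)\frac{A^{1+\rho}}{A^{\rho}}[\cdots]^{-(1+\rho)/\rho}L^{-(1+\rho)} \\
&= \frac{1 - \delta}{A^{\rho}}\left(\frac{Q}{L}\right)^{1+\rho} > 0 \qquad \text{[by (12.63)]}
\end{aligned} \tag{12.64}$$

and similarly,

$$Q_K \equiv \frac{\partial Q}{\partial K} = \frac{\delta}{A^{\rho}}\left(\frac{Q}{K}\right)^{1+\rho} > 0 \tag{12.65}$$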


† K. J. Arrow, H. B. Chenery, B. S. Minhas, and R. M. Solow, "Capital-Labor Substitution and Economic
Efficiency," Review of Economics and Statistics, August 1961, pp. 225-250.




which are defined for positive values of K and L. Thus the slope of isoquants (with K
plotted vertically and L horizontally) is

$$\frac{dK}{dL} = -\frac{Q_L}{Q_K} = -\frac{1 - \delta}{\delta}\left(\frac{K}{L}\right)^{1+\rho} < 0 \qquad \text{[see (11.36)]} \tag{12.66}$$

It can then be easily checked that $d^2K/dL^2 > 0$ (which we leave to you as an exercise),
implying that the isoquants are strictly convex for positive K and L.

It can also be shown that the CES production function is strictly quasiconcave for
positive K and L. Further differentiation of (12.64) and (12.65) shows that the second
derivatives of the function have the following signs:

$$Q_{LL} = \frac{\partial}{\partial L}Q_L = \frac{(1 - \delta)(1 + \rho)}{A^{\rho}}\left(\frac{Q}{L}\right)^{\rho}\frac{Q_LL - Q}{L^2} < 0 \qquad [Q_LL - Q < 0, \text{ by Euler's theorem}]$$

$$Q_{KK} = \frac{\partial}{\partial K}Q_K = \frac{\delta(1 + \rho)}{A^{\rho}}\left(\frac{Q}{K}\right)^{\rho}\frac{Q_KK - Q}{K^2} < 0 \qquad [Q_KK - Q < 0, \text{ by Euler's theorem}]$$

$$Q_{KL} = Q_{LK} = \frac{(1 - \delta)(1 + \rho)}{A^{\rho}}\left(\frac{Q}{L}\right)^{\rho}\frac{Q_K}{L} > 0$$

These derivative signs, valid for positive K and L, enable us to check the sufficient
condition for strict quasiconcavity (12.26). As you can verify,

$$|B_1| = -Q_K^2 < 0 \qquad \text{and} \qquad |B_2| = 2Q_KQ_LQ_{KL} - Q_K^2Q_{LL} - Q_L^2Q_{KK} > 0$$

Thus the CES function is strictly quasiconcave for positive K and L.

Last, we shall use the marginal products in (12.64) and (12.65) to find the elasticity of
substitution of the CES function. To satisfy the least-cost combination condition
$Q_L/Q_K = P_L/P_K$, where $P_L$ and $P_K$ denote the prices of labor service (wage rate) and
capital service (rental charge for capital goods), respectively, we must have

$$\frac{1 - \delta}{\delta}\left(\frac{K}{L}\right)^{1+\rho} = \frac{P_L}{P_K} \qquad \text{[see (12.66)]}$$

Thus the optimal input ratio is (introducing a shorthand symbol c)

$$\left(\frac{K^*}{L^*}\right) = \left(\frac{\delta}{1 - \delta}\right)^{1/(1+\rho)}\left(\frac{P_L}{P_K}\right)^{1/(1+\rho)} \equiv c\left(\frac{P_L}{P_K}\right)^{1/(1+\rho)} \tag{12.67}$$

Taking $(K^*/L^*)$ to be a function of $(P_L/P_K)$, we find the associated marginal and average
functions to be

$$\text{Marginal function} = \frac{d(K^*/L^*)}{d(P_L/P_K)} = \frac{c}{1 + \rho}\left(\frac{P_L}{P_K}\right)^{1/(1+\rho)-1}$$

$$\text{Average function} = \frac{K^*/L^*}{P_L/P_K} = c\left(\frac{P_L}{P_K}\right)^{1/(1+\rho)-1}$$







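Therefore the elasticity of substitution is†

$$\sigma = \frac{\text{Marginal function}}{\text{Average function}} = \frac{1}{1 + \rho} \tag{12.68}$$

What this shows is that $\sigma$ is a constant whose magnitude depends on the value of the
parameter $\rho$ as follows:

$$\begin{aligned}
-1 < \rho < 0 \quad &\Rightarrow \quad \sigma > 1 \\
\rho = 0 \quad &\Rightarrow \quad \sigma = 1 \\
0 < \rho < \infty \quad &\Rightarrow \quad \sigma < 1
\end{aligned}$$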
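
The result $\sigma = 1/(1 + \rho)$ can also be checked directly from the production function, without using the closed-form marginal products. The sketch below is one possible way to do so, under hypothetical parameter values: it estimates the marginal rate of technical substitution $Q_L/Q_K$ by central differences at nearby input ratios and then takes the ratio of the change in ln(K/L) to the change in ln(MRTS), which is the elasticity in logarithmic form.

```python
import math


def ces(K, L, A, delta, rho):
    """CES production function (12.63)."""
    return A * (delta * K ** -rho + (1 - delta) * L ** -rho) ** (-1.0 / rho)


def mrts(k, A, delta, rho, h=1e-6):
    """Q_L / Q_K at the input ratio k = K/L (L is fixed at 1), by central differences."""
    K, L = k, 1.0
    QL = (ces(K, L + h, A, delta, rho) - ces(K, L - h, A, delta, rho)) / (2 * h)
    QK = (ces(K + h, L, A, delta, rho) - ces(K - h, L, A, delta, rho)) / (2 * h)
    return QL / QK


def sigma(k, A, delta, rho, step=1e-3):
    """Change in ln(K/L) divided by change in ln(MRTS) around the ratio k."""
    k_hi, k_lo = k * (1 + step), k * (1 - step)
    d_ln_k = math.log(k_hi) - math.log(k_lo)
    d_ln_mrts = (math.log(mrts(k_hi, A, delta, rho))
                 - math.log(mrts(k_lo, A, delta, rho)))
    return d_ln_k / d_ln_mrts


# Hypothetical parameter values; rho is taken from each of the three ranges
# that matter here (negative, small positive, large positive).
A, delta, k = 1.5, 0.4, 2.0
results = {rho: sigma(k, A, delta, rho) for rho in (-0.5, 0.5, 2.0)}
for rho, value in results.items():
    print(rho, value, 1.0 / (1.0 + rho))
```

Fixing L at 1 is harmless because, for a linearly homogeneous function, the marginal products depend only on the ratio K/L.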


Cobb-Douglas Function as a Special Case of the CES Function 

In this last result, the middle case of $\rho = 0$ leads to a unitary elasticity of substitution which,
as we know, is characteristic of the Cobb-Douglas function. This suggests that the (linearly
homogeneous) Cobb-Douglas function is a special case of the (linearly homogeneous) CES
function. The difficulty is that the CES function, as given in (12.63), is undefined when
$\rho = 0$, because division by zero is not possible. Nevertheless, we can demonstrate that, as
$\rho \to 0$, the CES function approaches the Cobb-Douglas function.

For this demonstration, we shall rely on a technique known as L'Hôpital's rule. This rule
has to do with the evaluation of the limit of a function $f(x) = m(x)/n(x)$ as $x \to a$ (where a
can be either finite or infinite), when the numerator m(x) and the denominator n(x) either
(1) both tend to zero as $x \to a$, thus resulting in an expression of the 0/0 form, or (2) both
tend to ±∞ as $x \to a$, thus resulting in an expression in the form of ∞/∞ (or ∞/−∞, or
−∞/∞, or −∞/−∞). Even though the limit of f(x) cannot be evaluated as the
expression stands under these two circumstances, its value can nevertheless be found by using the
formula

$$\lim_{x \to a}\frac{m(x)}{n(x)} = \lim_{x \to a}\frac{m'(x)}{n'(x)} \qquad \text{[L'Hôpital's rule]} \tag{12.69}$$


Example 3

Find the limit of $(1 - x^2)/(1 - x)$ as $x \to 1$. Here, both m(x) and n(x) approach zero as x
approaches unity, thus exemplifying circumstance (1). Since m'(x) = −2x and n'(x) = −1,
we can write

$$\lim_{x \to 1}\frac{1 - x^2}{1 - x} = \lim_{x \to 1}\frac{-2x}{-1} = \lim_{x \to 1}2x = 2$$

This answer is identical with that obtained by another method in Example 2 of Sec. 6.4.
† Of course, we could also have obtained the same result by first taking the logarithms of both sides
of (12.67):

$$\ln\left(\frac{K^*}{L^*}\right) = \ln c + \frac{1}{1 + \rho}\ln\left(\frac{P_L}{P_K}\right)$$

and then applying the formula for elasticity in (10.28), to get

$$\sigma = \frac{d(\ln K^*/L^*)}{d(\ln P_L/P_K)} = \frac{1}{1 + \rho}$$




Example 4 


Find the limit of $(2x + 5)/(x + 1)$ as $x \to \infty$. When x becomes infinite, both m(x) and n(x)
become infinite in the present case; thus we have here an example of circumstance (2).
Since m'(x) = 2 and n'(x) = 1, we can write

$$\lim_{x \to \infty}\frac{2x + 5}{x + 1} = \lim_{x \to \infty}\frac{2}{1} = 2$$

Again, this answer is identical with that obtained by another method in Example 3 of
Sec. 6.4.
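
The two limits can be cross-checked numerically by evaluating the original quotient at points that approach the limit point and comparing with the value that L'Hôpital's rule delivers. This is a rough sketch only; the helper name `ratio_near` and the particular evaluation points are arbitrary choices, and a numerical check of this kind suggests a limit without proving it.

```python
def ratio_near(m, n, a, offsets=(1e-2, 1e-4, 1e-6)):
    """Values of m(x)/n(x) at points approaching a from the right."""
    return [m(a + d) / n(a + d) for d in offsets]


# Example 3: (1 - x^2)/(1 - x) as x -> 1, versus m'(x)/n'(x) = (-2x)/(-1) at x = 1.
example3 = ratio_near(lambda x: 1 - x ** 2, lambda x: 1 - x, 1.0)
lhopital3 = (-2 * 1.0) / (-1.0)

# Example 4: (2x + 5)/(x + 1) as x grows large, versus m'(x)/n'(x) = 2/1.
example4 = [(2 * x + 5) / (x + 1) for x in (1e2, 1e4, 1e6)]
lhopital4 = 2.0 / 1.0

print(example3, lhopital3)
print(example4, lhopital4)
```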


It may turn out that the right-side expression in (12.69) again falls into the 0/0 or
the ∞/∞ format, same as the left-side expression. In such an event, we may reapply
L'Hôpital's rule, i.e., we may look for the limit of m''(x)/n''(x) as $x \to a$, and take that
limit as our answer. It may also turn out that even though the given function f(x), whose
limit we wish to evaluate, is originally not in the form of m(x)/n(x) that falls into the 0/0
or the ∞/∞ format upon limit-taking, a suitable transformation will make f(x) amenable
to the application of the rule in (12.69). This latter possibility can be illustrated by the
problem of finding the limit of the CES function (12.63), now viewed as a function
$Q(\rho)$, as $\rho \to 0$.

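As given, $Q(\rho)$ is not in the form of $m(\rho)/n(\rho)$. Dividing both sides of (12.63) by A,
and taking the natural log, however, we do get an expression in that form, namely,

$$\ln\frac{Q}{A} = \frac{-\ln[\delta K^{-\rho} + (1 - \delta)L^{-\rho}]}{\rho} \equiv \frac{m(\rho)}{n(\rho)} \tag{12.70}$$

Moreover, as $\rho \to 0$, we find that $m(\rho) \to -\ln(\delta + 1 - \delta) = -\ln 1 = 0$, and $n(\rho) \to 0$,
too. Thus L'Hôpital's rule can be used to find the limit of ln(Q/A). Once that is done, the
limit of Q can also be found: since $Q/A = e^{\ln(Q/A)}$, so that $Q = Ae^{\ln(Q/A)}$, it follows that

$$\lim Q = \lim Ae^{\ln(Q/A)} = Ae^{\lim \ln(Q/A)} \tag{12.71}$$

From (12.70), let us first find $m'(\rho)$ and $n'(\rho)$, as required by L'Hôpital's rule. The latter
is simply $n'(\rho) = 1$. The former is

$$\begin{aligned}
m'(\rho) &= \frac{-1}{[\delta K^{-\rho} + (1 - \delta)L^{-\rho}]}\frac{d}{d\rho}[\delta K^{-\rho} + (1 - \delta)L^{-\rho}] \qquad \text{[chain rule]} \\
&= \frac{-[-\delta K^{-\rho}\ln K - (1 - \delta)L^{-\rho}\ln L]}{[\delta K^{-\rho} + (1 - \delta)L^{-\rho}]} \qquad \text{[by (10.21')]}
\end{aligned}$$

By L'Hôpital's rule, therefore, we have

$$\lim_{\rho \to 0}\ln\frac{Q}{A} = \lim_{\rho \to 0}\frac{m'(\rho)}{n'(\rho)} = \frac{\delta\ln K + (1 - \delta)\ln L}{1} = \ln(K^{\delta}L^{1-\delta})$$

In view of this result, when e is raised to the power of $\lim_{\rho \to 0}\ln(Q/A)$, the outcome is simply
$K^{\delta}L^{1-\delta}$. Hence, by (12.71), we finally arrive at the result

$$\lim_{\rho \to 0}Q = AK^{\delta}L^{1-\delta}$$

showing that, as $\rho \to 0$, the CES function indeed tends to the Cobb-Douglas function.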
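
The convergence can be observed numerically by evaluating the CES function at values of $\rho$ closer and closer to zero and comparing with $AK^{\delta}L^{1-\delta}$. The parameter values and input levels below are hypothetical, and the computation only illustrates the limit; it is not a substitute for the derivation.

```python
def ces(K, L, A, delta, rho):
    """CES production function (12.63); undefined at rho = 0."""
    return A * (delta * K ** -rho + (1 - delta) * L ** -rho) ** (-1.0 / rho)


def cobb_douglas(K, L, A, delta):
    """The limiting form A * K**delta * L**(1 - delta)."""
    return A * K ** delta * L ** (1 - delta)


A, delta, K, L = 2.0, 0.3, 5.0, 8.0   # hypothetical values
target = cobb_douglas(K, L, A, delta)

positive_rhos = (0.5, 0.1, 0.01, 0.001)
gaps = [abs(ces(K, L, A, delta, rho) - target) for rho in positive_rhos]
for rho, gap in zip(positive_rhos, gaps):
    print(rho, ces(K, L, A, delta, rho), target, gap)

# Approaching zero from the negative side gives the same limiting value.
print(-0.001, ces(K, L, A, delta, -0.001), target)
```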


EXERCISE 12.7 
1. Suppose that the isoquants in Fig. 12.9b are derived from a particular homogeneous
production function Q = Q(a, b). Noting that OE = EE' = E'E'', what must be the
ratios between the output levels represented by the three isoquants if the function Q is
homogeneous
(a) of degree one? (b) of degree two?
2. For the generalized Cobb-Douglas case, if we plot the ratio $b^*/a^*$ against the ratio
$P_a/P_b$, what type of curve will result? Does this result depend on the assumption that
$\alpha + \beta = 1$? Read the elasticity of substitution graphically from this curve.
3. Is the CES production function characterized by diminishing returns to each input for
all positive levels of input?
4. Show that, on an isoquant of the CES function, $d^2K/dL^2 > 0$.
5. (a) For the CES function, if each factor of production is paid according to its marginal
product, what is the ratio of labor's share of product to capital's share of product?
Would a larger value of $\delta$ mean a larger relative share for capital?
(b) For the Cobb-Douglas function, is the ratio of labor's share to capital's share
dependent on the K/L ratio? Does the same answer apply to the CES function?
6. (a) The CES production function rules out $\rho = -1$. If $\rho = -1$, however, what would be
the general shape of the isoquants for positive K and L?
(b) Is $\sigma$ defined for $\rho = -1$? What is the limit of $\sigma$ as $\rho \to -1$?
(c) Interpret economically the results for parts (a) and (b).
7. Show that by writing the CES function as $Q = A[\delta K^{-\rho} + (1 - \delta)L^{-\rho}]^{-r/\rho}$, where r > 0
is a new parameter, we can introduce increasing returns to scale and decreasing returns
to scale.


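8. Evaluate the following:
(a) $\lim_{x \to 4}\dfrac{x^2 - x - 12}{x - 4}$          (c) $\lim_{x \to 0}\dfrac{5^x - e^x}{x}$
(b) $\lim_{x \to 0}\dfrac{e^x - 1}{x}$                  (d) $\lim_{x \to \infty}\dfrac{\ln x}{x}$
9. By use of L'Hôpital's rule, show that
(a) $\lim_{x \to \infty}\dfrac{x^n}{e^x} = 0$     (b) $\lim_{x \to 0^+}x\ln x = 0$     (c) $\lim_{x \to 0^+}x^x = 1$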

Chapter 13

Further Topics in Optimization


This chapter deals with two major topics. The first is nonlinear programming, which
extends the techniques of constrained optimization of Chap. 12 by allowing inequality
constraints into the problem. In Chap. 12, the constraints must be satisfied as strict equalities;
i.e., the constraints are always binding. Now we shall consider constraints that may not be
binding in the solution; i.e., they may be satisfied as inequalities in the solution.

In the second part of this chapter, we revert to the realm of classical constrained
optimization to discuss some topics left untouched in the previous chapters. These include
the indirect objective function, the envelope theorem, and the concept of duality.





13.1 Nonlinear Programming and Kuhn-Tucker Conditions 







In the history of methodological development, the first attempts at dealing with inequality
constraints were concentrated on linear ones only. With linearity prevailing in the
constraints as well as in the objective function, the resulting methodology is quite naturally
christened linear programming. Despite the limitation of linearity, however, we could, for
the first time, explicitly specify the choice variables to be nonnegative, as is appropriate in
most economic analysis. This represents a significant advance. Nonlinear programming, a
later development, makes it possible even to handle nonlinear inequality constraints and
a nonlinear objective function. Thus it occupies a most important place in optimization
methodology.

In the classical optimization problem, with no explicit restrictions on the signs of the
choice variables, and with no inequalities in the constraints, the first-order condition for
a relative or local extremum is simply that the first partial derivatives of the (smooth)
Lagrangian function with respect to all the choice variables and the Lagrange multipliers
be zero. In nonlinear programming, there exists a similar type of first-order condition,
known as the Kuhn-Tucker conditions.† As we shall see, however, while the classical
first-order condition is always necessary, the Kuhn-Tucker conditions cannot be accorded the





† H. W. Kuhn and A. W. Tucker, "Nonlinear Programming," in J. Neyman (ed.), Proceedings of the
Second Berkeley Symposium on Mathematical Statistics and Probability, University of California Press,
Berkeley, California, 1951, pp. 481-492.


FIGURE 13.1 




status of necessary conditions unless a certain proviso is satisfied. On the other hand, under
certain specific circumstances, the Kuhn-Tucker conditions turn out to be sufficient
conditions, or even necessary-and-sufficient conditions as well.

Since the Kuhn-Tucker conditions are the single most important analytical result in non- 
linear programming, it is essential to have a proper understanding of these conditions as 
well as their implications. For the sake of expository convenience, we shall develop these 
conditions in two steps. 


Step 1: Effect of Nonnegativity Restrictions 


As the first step, consider a problem with nonnegativity restrictions on the choice variables, 
but with no other constraints. Taking the single-variable case, in particular, we have 
Maximize     $\pi = f(x_1)$                                   (13.1)
subject to   $x_1 \ge 0$

where the function $f$ is assumed to be differentiable. In view of the restriction $x_1 \ge 0$, three
possible situations may arise. First, if a local maximum of $\pi$ occurs in the interior of the
shaded feasible region in Fig. 13.1, such as point A in Fig. 13.1a, then we have an interior
solution. The first-order condition in this case is $d\pi/dx_1 = f'(x_1) = 0$, the same as in the
classical problem. Second, as illustrated by point B in Fig. 13.1b, a local maximum can also
occur on the vertical axis, where $x_1 = 0$. Even in this second case, where we have a boundary
solution, the first-order condition $f'(x_1) = 0$ nevertheless remains valid. However, as a
third possibility, a local maximum may in the present context take the position of point C
or point D in Fig. 13.1c, because to qualify as a local maximum in problem (13.1), the candidate
point merely has to be higher than the neighboring points within the feasible region.
In view of this last possibility, the maximum point in a problem like (13.1) can be characterized,
not only by the equation $f'(x_1) = 0$, but also by the inequality $f'(x_1) < 0$. Note, on
the other hand, that the opposite inequality $f'(x_1) > 0$ can safely be ruled out, for at a point
where the curve is upward-sloping, we can never have a maximum, even if that point is
located on the vertical axis, such as point E in Fig. 13.1a.

The upshot of the preceding discussion is that, in order for a value of $x_1$ to give a local
maximum of $\pi$ in problem (13.1), it must satisfy one of the following three conditions:

$f'(x_1) = 0$   and   $x_1 > 0$   [point A]            (13.2)
$f'(x_1) = 0$   and   $x_1 = 0$   [point B]            (13.3)
$f'(x_1) < 0$   and   $x_1 = 0$   [points C and D]     (13.4)


[Fig. 13.1, panels (a), (b), and (c): each panel plots $\pi = f(x_1)$ against $x_1$]


404 Part Four Optimization Problems 


Actually, these three conditions can be consolidated into a single statement:

$f'(x_1) \le 0$     $x_1 \ge 0$     and     $x_1 f'(x_1) = 0$          (13.5)


The first inequality in (13.5) is a summary of the information regarding $f'(x_1)$ enumerated
in (13.2) through (13.4). The second inequality is a similar summary for $x_1$; in fact,
it merely reiterates the nonnegativity restriction of the problem. And, as for the third part
of (13.5), we have an equation which expresses an important feature common to (13.2)
through (13.4), namely, that of the two quantities $x_1$ and $f'(x_1)$, at least one must take a zero
value, so that the product of the two must be zero. This feature is referred to as the
complementary slackness between $x_1$ and $f'(x_1)$. Taken together, the three parts of (13.5) constitute
the first-order necessary condition for a local maximum in a problem where the choice variable
must be nonnegative. But going a step further, we can also take them to be necessary for
a global maximum. This is because a global maximum must also be a local maximum and,
as such, must also satisfy the necessary condition for a local maximum.
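
As a small numerical illustration of (13.5), the sketch below tests the three parts of the
condition at a candidate point. The function name, the tolerance, and the sample objective
$f(x_1) = -(x_1 + 1)^2$ are all assumptions made for this illustration; they do not come from the
text.

```python
def satisfies_condition_13_5(x1, fprime_at_x1, tol=1e-9):
    """Check the three parts of (13.5) at a candidate point x1."""
    marginal = fprime_at_x1 <= tol             # f'(x1) <= 0
    nonnegative = x1 >= -tol                   # x1 >= 0
    slack = abs(x1 * fprime_at_x1) <= tol      # x1 * f'(x1) = 0
    return marginal and nonnegative and slack


def fprime(x1):
    # Hypothetical objective f(x1) = -(x1 + 1)**2; its slope is negative for all x1 >= 0.
    return -2.0 * (x1 + 1.0)


# A boundary candidate with a negative slope (the situation of points C and D).
boundary_candidate = satisfies_condition_13_5(0.0, fprime(0.0))
# An interior candidate with a nonzero slope violates complementary slackness.
interior_candidate = satisfies_condition_13_5(1.0, fprime(1.0))
print(boundary_candidate, interior_candidate)
```

Because (13.5) is only a necessary condition, a point that passes this check is a candidate
for a local maximum, not a confirmed one.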

When the problem contains $n$ choice variables:

Maximize     $\pi = f(x_1, x_2, \ldots, x_n)$
subject to   $x_j \ge 0$     $(j = 1, 2, \ldots, n)$                  (13.6)


the classical first-order condition $f_1 = f_2 = \cdots = f_n = 0$ must be similarly modified. To
do this, we can apply the same type of reasoning underlying (13.5) to each choice variable
$x_j$ taken by itself. Graphically, this amounts to viewing the horizontal axis in Fig. 13.1 as
representing each $x_j$ in turn. The required modification of the first-order condition then
readily suggests itself:

$f_j \le 0$     $x_j \ge 0$     and     $x_j f_j = 0$     $(j = 1, 2, \ldots, n)$     (13.7)

where $f_j$ is the partial derivative $\partial\pi/\partial x_j$.


Step 2: Effect of Inequality Constraints 
With this background, we now proceed to the second step, and try to include inequality
constraints as well. For simplicity, let us first deal with a problem with three choice variables
($n = 3$) and two constraints ($m = 2$):

Maximize     $\pi = f(x_1, x_2, x_3)$
subject to   $g^1(x_1, x_2, x_3) \le r_1$
             $g^2(x_1, x_2, x_3) \le r_2$                              (13.8)
and          $x_1, x_2, x_3 \ge 0$

which, with the help of two dummy variables $s_1$ and $s_2$, can be transformed into the
equivalent form

Maximize     $\pi = f(x_1, x_2, x_3)$
subject to   $g^1(x_1, x_2, x_3) + s_1 = r_1$
             $g^2(x_1, x_2, x_3) + s_2 = r_2$                          (13.8')
and          $x_1, x_2, x_3, s_1, s_2 \ge 0$




If the nonnegativity restrictions are absent, we may, in line with the classical approach,
form the Lagrangian function:

$Z' = f(x_1, x_2, x_3) + \lambda_1[r_1 - g^1(x_1, x_2, x_3) - s_1]
      + \lambda_2[r_2 - g^2(x_1, x_2, x_3) - s_2]$                     (13.9)

and write the first-order condition as

$\partial Z'/\partial x_1 = \partial Z'/\partial x_2 = \partial Z'/\partial x_3 = \partial Z'/\partial s_1 = \partial Z'/\partial s_2 = \partial Z'/\partial\lambda_1 = \partial Z'/\partial\lambda_2 = 0$

But since the $x_j$ and $s_i$ variables do have to be nonnegative, the first-order condition on
those variables should be modified in accordance with (13.7). Consequently, we obtain the
following set of conditions instead:


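$\partial Z'/\partial x_j \le 0$     $x_j \ge 0$     and     $x_j\,\partial Z'/\partial x_j = 0$
$\partial Z'/\partial s_i \le 0$     $s_i \ge 0$     and     $s_i\,\partial Z'/\partial s_i = 0$          (13.10)
$\partial Z'/\partial\lambda_i = 0$                          $(i = 1, 2;\ j = 1, 2, 3)$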


Note that the derivatives $\partial Z'/\partial\lambda_i$ are still to be set strictly equal to zero. (Why?)

Each line of (13.10) relates to a different type of variable. But we can consolidate the
last two lines and, in the process, eliminate the dummy variable $s_i$ from the first-order
condition. Inasmuch as $\partial Z'/\partial s_i = -\lambda_i$, the second line of (13.10) tells us that we must have
$-\lambda_i \le 0$, $s_i \ge 0$, and $-s_i\lambda_i = 0$, or equivalently,

$s_i \ge 0$     $\lambda_i \ge 0$     and     $s_i\lambda_i = 0$          (13.11)








But the third line, a restatement of the constraints in (13.8'), means that
$s_i = r_i - g^i(x_1, x_2, x_3)$. By substituting the latter into (13.11), therefore, we can combine
the second and third lines of (13.10) into

$r_i - g^i(x_1, x_2, x_3) \ge 0$     $\lambda_i \ge 0$     and     $\lambda_i[r_i - g^i(x_1, x_2, x_3)] = 0$

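This enables us to express the first-order condition (13.10) in an equivalent form without
the dummy variables. Using the symbol $g^i_j$ to denote $\partial g^i/\partial x_j$, we now write

$\partial Z'/\partial x_j = f_j - (\lambda_1 g^1_j + \lambda_2 g^2_j) \le 0$     $x_j \ge 0$     and     $x_j\,\partial Z'/\partial x_j = 0$
$r_i - g^i(x_1, x_2, x_3) \ge 0$     $\lambda_i \ge 0$     and     $\lambda_i[r_i - g^i(x_1, x_2, x_3)] = 0$     (13.12)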


These, then, are the Kuhn-Tucker conditions for problem (13.8), or, more accurately, one
version of the Kuhn-Tucker conditions, expressed in terms of the Lagrangian function $Z'$
in (13.9).

Now that we know the results, though, it is possible to obtain the same set of conditions
more directly by using a different Lagrangian function. Given the problem (13.8), let us
ignore the nonnegativity restrictions as well as the inequality signs in the constraints and
write the purely classical type of Lagrangian function $Z$:

$Z = f(x_1, x_2, x_3) + \lambda_1[r_1 - g^1(x_1, x_2, x_3)] + \lambda_2[r_2 - g^2(x_1, x_2, x_3)]$     (13.13)


Then let us do the following: (1) set the partial derivatives $\partial Z/\partial x_j \le 0$, but $\partial Z/\partial\lambda_i \ge 0$,
(2) impose nonnegativity restrictions on $x_j$ and $\lambda_i$, and (3) require complementary
slackness to prevail between each variable and the partial derivative of $Z$ with respect to
that variable, that is, require their product to vanish. Since the results of these steps,
namely,

$\partial Z/\partial x_j = f_j - (\lambda_1 g^1_j + \lambda_2 g^2_j) \le 0$     $x_j \ge 0$     and     $x_j\,\partial Z/\partial x_j = 0$
$\partial Z/\partial\lambda_i = r_i - g^i(x_1, x_2, x_3) \ge 0$     $\lambda_i \ge 0$     and     $\lambda_i\,\partial Z/\partial\lambda_i = 0$     (13.14)

are identical with (13.12), the Kuhn-Tucker conditions are expressible also in terms of the
Lagrangian function $Z$ (as against $Z'$). Note that, by switching from $Z'$ to $Z$, we can not only
arrive at the Kuhn-Tucker conditions more directly, but also identify the expression
$r_i - g^i(x_1, x_2, x_3)$, which was left nameless in (13.12), as the partial derivative $\partial Z/\partial\lambda_i$.
In the subsequent discussion, therefore, we shall only use the (13.14) version of the
Kuhn-Tucker conditions, based on the Lagrangian function $Z$.


Example 1

If we cast the familiar problem of utility maximization into the nonlinear programming
mold, we may have a problem with an inequality constraint as follows:

Maximize     $U = U(x, y)$
subject to   $P_x x + P_y y \le B$
and          $x, y \ge 0$

Note that, with the inequality constraint, the consumer is no longer required to spend the
entire amount $B$.

To add a new twist to the problem, however, let us suppose that a ration has been imposed
on commodity $x$ equal to $X_0$. Then the consumer would face a second constraint, and
the problem changes to

Maximize     $U = U(x, y)$
subject to   $P_x x + P_y y \le B$
             $x \le X_0$
and          $x, y \ge 0$

The Lagrangian function is

$Z = U(x, y) + \lambda_1(B - P_x x - P_y y) + \lambda_2(X_0 - x)$

and the Kuhn-Tucker conditions are

$Z_x = U_x - P_x\lambda_1 - \lambda_2 \le 0$          $x \ge 0$           and     $x Z_x = 0$
$Z_y = U_y - P_y\lambda_1 \le 0$                      $y \ge 0$           and     $y Z_y = 0$
$Z_{\lambda_1} = B - P_x x - P_y y \ge 0$             $\lambda_1 \ge 0$   and     $\lambda_1 Z_{\lambda_1} = 0$
$Z_{\lambda_2} = X_0 - x \ge 0$                       $\lambda_2 \ge 0$   and     $\lambda_2 Z_{\lambda_2} = 0$

It is useful to examine the implications of the third column of the Kuhn-Tucker conditions.
The condition $\lambda_1 Z_{\lambda_1} = 0$, in particular, requires that

$\lambda_1(B - P_x x - P_y y) = 0$

Therefore, we must have either

$\lambda_1 = 0$     or     $B - P_x x - P_y y = 0$

If we interpret $\lambda_1$ as the marginal utility of budget money (income), and if the budget
constraint is nonbinding (satisfied as an inequality in the solution, with money left over), the
marginal utility of $B$ should be zero ($\lambda_1 = 0$).
Similarly, the condition $\lambda_2 Z_{\lambda_2} = 0$ requires that either

$\lambda_2 = 0$     or     $X_0 - x = 0$

Since $\lambda_2$ can be interpreted as the marginal utility of relaxing the constraint, we see that if
the ration constraint is nonbinding, the marginal utility of relaxing the constraint should be
zero ($\lambda_2 = 0$).
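This feature, referred to as complementary slackness, plays an essential role in the search
for a solution. We shall now illustrate this with a numerical example:

Maximize     $U = xy$
subject to   $x + y \le 100$
             $x \le 40$
and          $x, y \ge 0$

The Lagrangian is

$Z = xy + \lambda_1(100 - x - y) + \lambda_2(40 - x)$

and the Kuhn-Tucker conditions become

$Z_x = y - \lambda_1 - \lambda_2 \le 0$          $x \ge 0$           and     $x Z_x = 0$
$Z_y = x - \lambda_1 \le 0$                      $y \ge 0$           and     $y Z_y = 0$
$Z_{\lambda_1} = 100 - x - y \ge 0$              $\lambda_1 \ge 0$   and     $\lambda_1 Z_{\lambda_1} = 0$
$Z_{\lambda_2} = 40 - x \ge 0$                   $\lambda_2 \ge 0$   and     $\lambda_2 Z_{\lambda_2} = 0$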


To solve a nonlinear programming problem, the typical approach is one of trial and
error. We can, for example, start by trying a zero value for a choice variable. Setting a variable
equal to zero always simplifies the marginal conditions by causing certain terms to
drop out. If appropriate nonnegative values of Lagrange multipliers can then be found that
satisfy all the marginal inequalities, the zero solution will be optimal. If, on the other hand,
the zero solution violates some of the inequalities, then we must let one or more choice variables
be positive. For every positive choice variable, we may, by complementary slackness,
convert a weak inequality marginal condition into a strict equality. Properly solved, such an
equality will lead us either to a solution, or to a contradiction that would then compel us to
try something else. If a solution exists, such trials will eventually enable us to uncover it. We
can also start by assuming one of the constraints to be nonbinding. Then the related
Lagrange multiplier will be zero by complementary slackness and we have thus eliminated
a variable. If this assumption leads to a contradiction, then we must treat the said constraint
as a strict equality and proceed on that basis.

For the present example, it makes no sense to try $x = 0$ or $y = 0$, for then we would have
$U = xy = 0$. We therefore assume both $x$ and $y$ to be nonzero, and deduce $Z_x = Z_y = 0$
from complementary slackness. This means

$y - \lambda_1 - \lambda_2 = x - \lambda_1\ (= 0)$

so that $y - \lambda_2 = x$.


Now, assume the ration constraint to be nonbinding in the solution, which implies that
$\lambda_2 = 0$. Then we have $x = y$, and the given budget $B = 100$ yields the trial solution
$x = y = 50$. But this solution violates the ration constraint $x \le 40$. Hence we must adopt
the alternative assumption that the ration constraint is binding with $x^* = 40$. The budget
constraint then allows the consumer to have $y^* = 60$. Moreover, since complementary
slackness dictates that $Z_x = Z_y = 0$, we can readily calculate that $\lambda_1^* = 40$ and $\lambda_2^* = 20$.
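
The two trials just described can be checked mechanically. The following is a minimal sketch,
assuming a helper named `kuhn_tucker_holds` (our own name, not part of the text) that evaluates
the four marginal conditions, the nonnegativity restrictions, and complementary slackness for
the numerical example $U = xy$.

```python
def kuhn_tucker_holds(x, y, lam1, lam2, tol=1e-9):
    """Check the Kuhn-Tucker maximum conditions for U = xy with x + y <= 100 and x <= 40."""
    Zx = y - lam1 - lam2          # dZ/dx
    Zy = x - lam1                 # dZ/dy
    Zl1 = 100 - x - y             # dZ/d(lambda1), slack in the budget constraint
    Zl2 = 40 - x                  # dZ/d(lambda2), slack in the ration constraint
    pairs = [(x, Zx), (y, Zy), (lam1, Zl1), (lam2, Zl2)]
    marginal = Zx <= tol and Zy <= tol and Zl1 >= -tol and Zl2 >= -tol
    nonnegative = all(v >= -tol for v, _ in pairs)
    slack = all(abs(v * d) <= tol for v, d in pairs)
    return marginal and nonnegative and slack


# First trial: ration nonbinding (lambda2 = 0), so x = y = 50 and lambda1 = x = 50.
first_trial = kuhn_tucker_holds(50, 50, 50, 0)
# Second trial: ration binding, x = 40, y = 60, lambda1 = 40, lambda2 = 20.
second_trial = kuhn_tucker_holds(40, 60, 40, 20)
print(first_trial, second_trial)
```

The first trial fails only because $Z_{\lambda_2} = 40 - 50 < 0$, which is the violation of the
ration constraint noted above.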


Interpretation of the Kuhn-Tucker Conditions 
Parts of the Kuhn-Tucker conditions (13.14) are merely a restatement of certain aspects of
the given problem. Thus the conditions $x_j \ge 0$ merely repeat the nonnegativity restrictions,
and the conditions $\partial Z/\partial\lambda_i \ge 0$ merely reiterate the constraints. To include these in (13.14),
however, has the important advantage of revealing more clearly the remarkable symmetry
between the two types of variables, $x_j$ (choice variables) and $\lambda_i$ (Lagrange multipliers). To
each variable in each category, there corresponds a marginal condition, $\partial Z/\partial x_j \le 0$ or
$\partial Z/\partial\lambda_i \ge 0$, to be satisfied by the optimal solution. Each of the variables must be
nonnegative as well. And, finally, each variable is characterized by complementary slackness in
relation to a particular partial derivative of the Lagrangian function $Z$. This means that, for
each $x_j$, we must find in the optimal solution that either the marginal condition holds as an
equality, as in the classical context, or the choice variable in question must take a zero
value, or both. Analogously, for each $\lambda_i$, we must find in the optimal solution that either
the marginal condition holds as an equality (meaning that the ith constraint is exactly
satisfied) or the Lagrange multiplier vanishes, or both.

An even more explicit interpretation is possible when we look at the expanded expressions
for $\partial Z/\partial x_j$ and $\partial Z/\partial\lambda_i$ in (13.14). Assume the problem to be the familiar production
problem. Then we have

$f_j$ = marginal gross profit of jth product

$\lambda_i$ = shadow price of ith resource (the opportunity cost of using a unit of the
ith resource)

$g^i_j$ = amount of ith resource used up in producing the marginal unit of jth product

$\lambda_i g^i_j$ = marginal imputed cost of ith resource incurred in producing a unit of
jth product

$\sum_i \lambda_i g^i_j$ = aggregate marginal imputed cost of jth product


Thus the marginal condition

$\partial Z/\partial x_j = f_j - \sum_i \lambda_i g^i_j \le 0$

requires that the marginal gross profit of the jth product be no greater than its aggregate
marginal imputed cost; i.e., no underimputation is permitted. The complementary-slackness
condition then means that, if the optimal solution calls for the active production
of the jth product ($x_j^* > 0$), the marginal gross profit must be exactly equal to the aggregate
marginal imputed cost ($\partial Z/\partial x_j^* = 0$), as would be the situation in the classical optimization
problem. If, on the other hand, the marginal gross profit optimally falls short of the
aggregate imputed cost ($\partial Z/\partial x_j^* < 0$), entailing excess imputation, then that product must
not be produced ($x_j^* = 0$).* This latter situation is something that can never occur in the
classical context, for if the marginal gross profit is less than the marginal imputed cost, then
the output should in that framework be reduced all the way to the level where the marginal
condition is satisfied as an equality. What causes the situation of $\partial Z/\partial x_j^* < 0$ to qualify as
an optimal one here is the explicit specification of nonnegativity in the present framework.
For then the most we can do in the way of output reduction is to lower production to the
level $x_j^* = 0$, and if we still find $\partial Z/\partial x_j^* < 0$ at the zero output, we stop there anyway.

As for the remaining conditions, which relate to the variables $\lambda_i$, their meanings are even
easier to perceive. First of all, the marginal condition $\partial Z/\partial\lambda_i \ge 0$ merely requires the firm to
stay within the capacity limitation of every resource in the plant. The complementary-slackness
condition then stipulates that, if the ith resource is not fully used in the optimal solution
($\partial Z/\partial\lambda_i^* > 0$), the shadow price of that resource, which is never allowed to be negative,
must be set equal to zero ($\lambda_i^* = 0$). On the other hand, if a resource has a positive shadow price
in the optimal solution ($\lambda_i^* > 0$), then it is perforce a fully utilized resource ($\partial Z/\partial\lambda_i^* = 0$).

It is also possible, of course, to take the Lagrange-multiplier value $\lambda_i^*$ to be a measure
of how the optimal value of the objective function reacts to a slight relaxation of the ith
constraint. In that light, complementary slackness would mean that, if the ith constraint is
optimally not binding ($\partial Z/\partial\lambda_i^* > 0$), then relaxing that particular constraint will not affect
the optimal value of the gross profit ($\lambda_i^* = 0$), just as loosening a belt which is not
constricting one’s waist to begin with will not produce any greater comfort. If, on the other
hand, a slight relaxation of the ith constraint (increasing the endowment of the ith resource)
does increase the gross profit ($\lambda_i^* > 0$), then that resource constraint must in fact be binding
in the optimal solution ($\partial Z/\partial\lambda_i^* = 0$).


The n-Variable, m-Constraint Case 

The preceding discussion can be generalized in a straightforward manner to the case of
$n$ choice variables and $m$ constraints. The Lagrangian function $Z$ will appear in the more
general form

$Z = f(x_1, x_2, \ldots, x_n) + \sum_{i=1}^{m} \lambda_i[r_i - g^i(x_1, x_2, \ldots, x_n)]$          (13.15)

And the Kuhn-Tucker conditions will simply be

$\partial Z/\partial x_j \le 0$           $x_j \ge 0$           and     $x_j\,\partial Z/\partial x_j = 0$           [maximization]
$\partial Z/\partial\lambda_i \ge 0$     $\lambda_i \ge 0$     and     $\lambda_i\,\partial Z/\partial\lambda_i = 0$     $(i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n)$     (13.16)





Here, in order to avoid a cluttered appearance, we have not written out the expanded
expressions for the partial derivatives $\partial Z/\partial x_j$ and $\partial Z/\partial\lambda_i$. But you are urged to write them
out for a more detailed view of the Kuhn-Tucker conditions, similar to what was given in
(13.14). Note that, aside from the change in the dimension of the problem, the Kuhn-Tucker
conditions remain entirely the same. The interpretation of these conditions should naturally
also remain the same.


* Remember that, given the equation $ab = 0$, where $a$ and $b$ are real numbers, we can legitimately
infer that $a \ne 0$ implies $b = 0$, but it is not true that $a = 0$ implies $b \ne 0$, since $b = 0$ is also
consistent with $a = 0$.




What if the problem is one of minimization? One way of handling it is to convert the
problem into a maximization problem and then apply (13.16). To minimize $C$ is equivalent
to maximizing $-C$, so such a conversion is always feasible. But we must, of course, also
reverse the constraint inequalities by multiplying every constraint through by $-1$. Instead of
going through the conversion process, however, we may, again using the Lagrangian function
$Z$ as defined in (13.15), directly apply the minimization version of the Kuhn-Tucker
conditions as follows:

$\partial Z/\partial x_j \ge 0$           $x_j \ge 0$           and     $x_j\,\partial Z/\partial x_j = 0$           [minimization]
$\partial Z/\partial\lambda_i \le 0$     $\lambda_i \ge 0$     and     $\lambda_i\,\partial Z/\partial\lambda_i = 0$     $(i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n)$     (13.17)

This you should compare with (13.16).

Reading (13.16) and (13.17) horizontally (rowwise), we see that the Kuhn-Tucker conditions
for both maximization and minimization problems consist of a set of conditions relating
to the choice variables $x_j$ (first row) and another set relating to the Lagrange multipliers $\lambda_i$
(second row). Reading them vertically (columnwise), on the other hand, we note that, for each
$x_j$ and $\lambda_i$, there is a marginal condition (first column), a nonnegativity restriction (second
column), and a complementary-slackness condition (third column). In any given problem,
the marginal conditions pertaining to the choice variables always differ, as a group, from the
marginal conditions for the Lagrange multipliers in the direction of the inequality they take.

Subject to the proviso to be explained in Sec. 13.2, the Kuhn-Tucker maximum conditions
(13.16) and minimum conditions (13.17) are necessary conditions for a local maximum
and local minimum, respectively. But since a global maximum (minimum) must also
be a local maximum (minimum), the Kuhn-Tucker conditions can also be taken as necessary
conditions for a global maximum (minimum), subject to the same proviso.
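
The row-and-column structure of (13.16) and (13.17) lends itself to a generic check. The
sketch below is one possible arrangement, not a standard routine: the function name, the
`sense` argument, and the small test problem (minimize $x_1^2 + x_2^2$ subject to
$x_1 + x_2 \ge 2$) are assumptions introduced only for illustration. The caller supplies the
values of the partial derivatives of $Z$ at the candidate point.

```python
def kuhn_tucker_holds(x, lam, dZ_dx, dZ_dlam, sense="max", tol=1e-9):
    """Check (13.16) when sense == "max" or (13.17) when sense == "min"."""
    if sense not in ("max", "min"):
        raise ValueError("sense must be 'max' or 'min'")
    sign = 1.0 if sense == "max" else -1.0
    # Maximization: dZ/dx_j <= 0 and dZ/dlam_i >= 0; minimization reverses both.
    marginal_x = all(sign * d <= tol for d in dZ_dx)
    marginal_lam = all(sign * d >= -tol for d in dZ_dlam)
    variables = list(x) + list(lam)
    derivatives = list(dZ_dx) + list(dZ_dlam)
    nonnegative = all(v >= -tol for v in variables)
    slack = all(abs(v * d) <= tol for v, d in zip(variables, derivatives))
    return marginal_x and marginal_lam and nonnegative and slack


def derivatives_of_Z(x, lam):
    # Hypothetical problem: Z = x1**2 + x2**2 + lam1 * (2 - x1 - x2).
    x1, x2 = x
    (lam1,) = lam
    return (2 * x1 - lam1, 2 * x2 - lam1), (2 - x1 - x2,)


candidates = [((1, 1), (2,)), ((2, 0), (0,)), ((0, 0), (0,))]
results = [kuhn_tucker_holds(x, lam, *derivatives_of_Z(x, lam), sense="min")
           for x, lam in candidates]
print(results)
```

Only the first candidate passes: the second violates complementary slackness for $x_1$, and the
third violates the constraint itself.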








Example 2

Let us apply the Kuhn-Tucker conditions to solve a minimization problem:

Minimize     $C = (x_1 - 4)^2 + (x_2 - 4)^2$
subject to   $2x_1 + 3x_2 \ge 6$
             $-3x_1 - 2x_2 \ge -12$
and          $x_1, x_2 \ge 0$

The Lagrangian function for this problem is

$Z = (x_1 - 4)^2 + (x_2 - 4)^2 + \lambda_1(6 - 2x_1 - 3x_2) + \lambda_2(-12 + 3x_1 + 2x_2)$

Since the problem is one of minimization, the appropriate conditions are (13.17), which
include the four marginal conditions

$\partial Z/\partial x_1 = 2(x_1 - 4) - 2\lambda_1 + 3\lambda_2 \ge 0$
$\partial Z/\partial x_2 = 2(x_2 - 4) - 3\lambda_1 + 2\lambda_2 \ge 0$
$\partial Z/\partial\lambda_1 = 6 - 2x_1 - 3x_2 \le 0$                                    (13.18)
$\partial Z/\partial\lambda_2 = -12 + 3x_1 + 2x_2 \le 0$

plus the nonnegativity and complementary-slackness conditions.




To find a solution, we again use the trial-and-error approach, realizing that the first few
trials may lead us into a blind alley. Suppose we first try $\lambda_1 > 0$ and $\lambda_2 > 0$ and check
whether we can find corresponding $x_1$ and $x_2$ values that satisfy both constraints. With
positive Lagrange multipliers, we must have $\partial Z/\partial\lambda_1 = \partial Z/\partial\lambda_2 = 0$. From the last two lines
of (13.18), we can thus write

$2x_1 + 3x_2 = 6$     and     $3x_1 + 2x_2 = 12$

These two equations yield the trial solution $x_1 = 4\tfrac{4}{5}$ and $x_2 = -1\tfrac{1}{5}$, which violates the
nonnegativity restriction on $x_2$.

Let us next try $x_1 > 0$ and $x_2 > 0$, which would imply $\partial Z/\partial x_1 = \partial Z/\partial x_2 = 0$ by
complementary slackness. Then, from the first two lines of (13.18), we can write

$2(x_1 - 4) - 2\lambda_1 + 3\lambda_2 = 0$     and     $2(x_2 - 4) - 3\lambda_1 + 2\lambda_2 = 0$          (13.19)

Multiplying the first equation by 2, and the second equation by 3, then subtracting the latter
from the former, we can eliminate $\lambda_2$ and obtain the result

$4x_1 - 6x_2 + 5\lambda_1 + 8 = 0$

By further assuming $\lambda_1 = 0$, we can derive the following relationship between $x_1$ and $x_2$:

$x_1 - \tfrac{3}{2}x_2 = -2$          (13.20)

In order to solve for the two variables, however, we need another relationship between $x_1$
and $x_2$. For this purpose, let us assume that $\lambda_2 \ne 0$, so that $\partial Z/\partial\lambda_2 = 0$. Then, from the last
line of (13.18), we can write (after rearrangement)

$3x_1 + 2x_2 = 12$          (13.21)

Together, (13.20) and (13.21) yield another trial solution

$x_1^* = \tfrac{28}{13}\ (= 2\tfrac{2}{13}) > 0$          $x_2^* = \tfrac{36}{13}\ (= 2\tfrac{10}{13}) > 0$

Substituting these values into (13.19), and solving for the Lagrange multipliers, we get

$\lambda_1^* = 0$          $\lambda_2^* = \tfrac{16}{13}\ (= 1\tfrac{3}{13}) > 0$

Since the solution values for the four variables are all nonnegative and satisfy both
constraints, they are acceptable as the final solution.
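
The arithmetic of the two trials can be reproduced exactly with rational numbers. The sketch
below is only a check on the figures above; the helper `solve_2x2` (Cramer's rule for a
two-equation linear system) is our own illustrative function, not something defined in the text.

```python
from fractions import Fraction as F


def solve_2x2(a, b, c, d, e, f):
    """Solve a*u + b*v = e and c*u + d*v = f exactly by Cramer's rule."""
    det = a * d - b * c
    return (e * d - b * f) / det, (a * f - e * c) / det


# Trial 1: both constraints binding, 2*x1 + 3*x2 = 6 and 3*x1 + 2*x2 = 12.
trial1 = solve_2x2(F(2), F(3), F(3), F(2), F(6), F(12))

# Trial 2: lambda1 = 0, with (13.20) x1 - (3/2)*x2 = -2 and (13.21) 3*x1 + 2*x2 = 12.
x1, x2 = solve_2x2(F(1), F(-3, 2), F(3), F(2), F(-2), F(12))
lam1 = F(0)
lam2 = -2 * (x1 - 4) / 3                        # first equation of (13.19) with lambda1 = 0
residual = 2 * (x2 - 4) - 3 * lam1 + 2 * lam2   # second equation of (13.19), should be 0
first_constraint_slack = 2 * x1 + 3 * x2 - 6    # nonbinding constraint, should be positive

print(trial1, (x1, x2), lam2, residual, first_constraint_slack)
```

The positive slack in the first constraint is what makes the assumption $\lambda_1 = 0$ consistent
with complementary slackness.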


EXERCISE 13.1 


1. Draw a set of diagrams similar to those in Fig. 13.1 for the minimization case, and deduce
a set of necessary conditions for a local minimum corresponding to (13.2) through
(13.4). Then condense these conditions into a single statement similar to (13.5).

2. (a) Show that, in (13.16), instead of writing

$\lambda_i\,\partial Z/\partial\lambda_i = 0$     $(i = 1, \ldots, m)$

as a set of $m$ separate conditions, it is sufficient to write a single equation in the
form of

$\sum_{i=1}^{m} \lambda_i\,\partial Z/\partial\lambda_i = 0$

(b) Can we do the same for the following set of conditions?

$x_j\,\partial Z/\partial x_j = 0$     $(j = 1, \ldots, n)$

3. Based on the reasoning used in Prob. 2, which set (or sets) of conditions in (13.17) can
be condensed into a single equation?
4. Suppose the problem is

Minimize     $C = f(x_1, x_2, \ldots, x_n)$
subject to   $g^i(x_1, x_2, \ldots, x_n) \ge r_i$     $(i = 1, 2, \ldots, m)$
and          $x_j \ge 0$     $(j = 1, 2, \ldots, n)$

Write the Lagrangian function, take the derivatives $\partial Z/\partial x_j$ and $\partial Z/\partial\lambda_i$, and write out
the expanded version of the Kuhn-Tucker minimum conditions (13.17).

5. Convert the minimization problem in Prob. 4 into a maximization problem, formulate
the Lagrangian function, take the derivatives with respect to $x_j$ and $\lambda_i$, and apply the
Kuhn-Tucker maximum conditions (13.16). Are the results consistent with those
obtained in Prob. 4?


13.2 The Constraint Qualification 





The Kuhn-Tucker conditions are necessary conditions only if a particular proviso is satisfied.
That proviso, called the constraint qualification, imposes a certain restriction on the
constraint functions of a nonlinear programming problem, for the specific purpose of ruling
out certain irregularities on the boundary of the feasible set that would invalidate the
Kuhn-Tucker conditions should the optimal solution occur there.


Irregularities at Boundary Points
Let us first illustrate the nature of such irregularities by means of some concrete examples.


Example 1

Maximize     $\pi = x_1$
subject to   $x_2 - (1 - x_1)^3 \le 0$
and          $x_1, x_2 \ge 0$

As shown in Fig. 13.2, the feasible region is the set of points that lie in the first quadrant
on or below the curve $x_2 = (1 - x_1)^3$. Since the objective function directs us to maximize
$x_1$, the optimal solution is the point (1, 0). But the solution fails to satisfy the Kuhn-Tucker
maximum conditions. To check this, we first write the Lagrangian function

$Z = x_1 + \lambda_1[-x_2 + (1 - x_1)^3]$

As the first marginal condition, we should then have

$\partial Z/\partial x_1 = 1 - 3\lambda_1(1 - x_1)^2 \le 0$

In fact, since $x_1^* = 1$ is positive, complementary slackness requires that this derivative vanish
when evaluated at the point (1, 0). However, the actual value we get happens to be
$\partial Z/\partial x_1^* = 1$, thus violating the given marginal condition.
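
A quick evaluation makes the failure concrete. In the sketch below, the function name and the
sample multiplier values are illustrative assumptions; the formula itself is the first marginal
condition written above.

```python
def dZ_dx1(x1, lam1):
    """First marginal derivative of Z = x1 + lam1 * (-x2 + (1 - x1)**3)."""
    return 1 - 3 * lam1 * (1 - x1) ** 2


# At the optimal point x1 = 1 the multiplier term drops out, whatever lam1 is.
values_at_optimum = [dZ_dx1(1, lam1) for lam1 in (0, 1, 10, 1000)]
print(values_at_optimum)
```

Since the derivative equals 1 for every trial value of $\lambda_1$, no nonnegative multiplier can
bring it down to zero at (1, 0).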


FIGURE 13.2


The reason for this anomaly is that the optimal solution (1, 0) occurs in this example at
an outward-pointing cusp, which constitutes one type of irregularity that can invalidate the
Kuhn-Tucker conditions at a boundary optimal solution. A cusp is a sharp point formed
when a curve takes a sudden reversal in direction, such that the slope of the curve on one
side of the point is the same as the slope of the curve on the other side of the point. Here,
the boundary of the feasible region at first follows the constraint curve, but when the point
(1, 0) is reached, it takes an abrupt turn westward and follows the trail of the horizontal axis
thereafter. Since the slopes of both the curved side and the horizontal side of the boundary
are zero at the point (1, 0), that point is a cusp.

Cusps are the most frequently cited culprits for the failure of the Kuhn-Tucker conditions,
but the truth is that the presence of a cusp is neither necessary nor sufficient to cause those
conditions to fail at an optimal solution. Examples 2 and 3 will confirm this.


Example 2

To the problem of Example 1, let us add a new constraint

$2x_1 + x_2 \le 2$

whose border, $x_2 = 2 - 2x_1$, plots as a straight line with slope $-2$ which passes through the
optimal point in Fig. 13.2. Clearly, the feasible region remains the same as before, and so
does the optimal solution at the cusp. But if we write the new Lagrangian function

$Z = x_1 + \lambda_1[-x_2 + (1 - x_1)^3] + \lambda_2[2 - 2x_1 - x_2]$

and the marginal conditions

$\partial Z/\partial x_1 = 1 - 3\lambda_1(1 - x_1)^2 - 2\lambda_2 \le 0$
$\partial Z/\partial x_2 = -\lambda_1 - \lambda_2 \le 0$
$\partial Z/\partial\lambda_1 = -x_2 + (1 - x_1)^3 \ge 0$
$\partial Z/\partial\lambda_2 = 2 - 2x_1 - x_2 \ge 0$


it turns out that the values $x_1^* = 1$, $x_2^* = 0$, $\lambda_1^* = 1$, and $\lambda_2^* = \tfrac{1}{2}$ do satisfy these four
inequalities, as well as the nonnegativity and complementary-slackness conditions. As a matter of
fact, $\lambda_1^*$ can be assigned any nonnegative value (not just 1), and all the conditions can still
be satisfied, which goes to show that the optimal value of a Lagrange multiplier is not
necessarily unique. More importantly, however, this example shows that the Kuhn-Tucker
conditions can remain valid despite the cusp.


FIGURE 13.3


Example 3

The feasible region of the problem

Maximize     $\pi = x_2 - x_1^2$
subject to   $-(10 - x_1^2 - x_2)^3 \le 0$
             $-x_1 \le -2$
and          $x_1, x_2 \ge 0$

as shown in Fig. 13.3, contains no cusp anywhere. Yet, at the optimal solution, (2, 6), the
Kuhn-Tucker conditions nonetheless fail to hold. For, with the Lagrangian function

$Z = x_2 - x_1^2 + \lambda_1(10 - x_1^2 - x_2)^3 + \lambda_2(-2 + x_1)$

the second marginal condition would require that

$\partial Z/\partial x_2 = 1 - 3\lambda_1(10 - x_1^2 - x_2)^2 \le 0$

Indeed, since $x_2^*$ is positive, this derivative should vanish when evaluated at the point (2, 6).
But actually we get $\partial Z/\partial x_2^* = 1$, regardless of the value assigned to $\lambda_1$. Thus the
Kuhn-Tucker conditions can fail even in the absence of a cusp; indeed, even when the feasible
region is a convex set as in Fig. 13.3. The fundamental reason why cusps are neither necessary nor
sufficient for the failure of the Kuhn-Tucker conditions is that the irregularities
referred to before relate, not to the shape of the feasible region per se, but to the forms of
the constraint functions themselves.
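
The contrast between Examples 2 and 3 can be verified numerically. The following sketch uses
function names and sample multiplier values of our own choosing; the derivative formulas are
those stated in the two examples.

```python
def example2_conditions_hold(lam1, lam2, x1=1.0, x2=0.0, tol=1e-9):
    """Kuhn-Tucker maximum conditions of Example 2, evaluated at the cusp (1, 0)."""
    Zx1 = 1 - 3 * lam1 * (1 - x1) ** 2 - 2 * lam2
    Zx2 = -lam1 - lam2
    Zl1 = -x2 + (1 - x1) ** 3
    Zl2 = 2 - 2 * x1 - x2
    pairs = [(x1, Zx1), (x2, Zx2), (lam1, Zl1), (lam2, Zl2)]
    marginal = Zx1 <= tol and Zx2 <= tol and Zl1 >= -tol and Zl2 >= -tol
    nonnegative = all(v >= -tol for v, _ in pairs)
    slack = all(abs(v * d) <= tol for v, d in pairs)
    return marginal and nonnegative and slack


def example3_dZ_dx2(lam1, x1=2.0, x2=6.0):
    """Second marginal derivative of Example 3, evaluated at the optimum (2, 6)."""
    return 1 - 3 * lam1 * (10 - x1 ** 2 - x2) ** 2


# Example 2: lambda2 = 1/2 works together with any nonnegative lambda1.
example2_results = [example2_conditions_hold(lam1, 0.5) for lam1 in (0.0, 1.0, 7.0)]
# Example 3: the derivative stays at 1 whatever lambda1 is, so it can never vanish.
example3_values = [example3_dZ_dx2(lam1) for lam1 in (0.0, 1.0, 100.0)]
print(example2_results, example3_values)
```

In Example 3 the multiplier term is multiplied by $(10 - x_1^2 - x_2)^2$, which is zero at (2, 6);
this is a property of the constraint function's form rather than of the region's shape.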













The Constraint Qualification
Boundary irregularities, cusp or no cusp, will not occur if a certain constraint qualification
is satisfied.

To explain this, let $x^* = (x_1^*, x_2^*, \ldots, x_n^*)$ be a boundary point of the feasible region and
a possible candidate for a solution, and let $dx = (dx_1, dx_2, \ldots, dx_n)$ represent a particular
direction of movement from the said boundary point. The direction-of-movement interpretation
of the vector $dx$ is perfectly in line with our earlier interpretation of a vector as a
directed line segment (an arrow), but here, the point of departure is the point $x^*$ instead of
the point of origin, and so the vector $dx$ is not in the nature of a radius vector. We shall now
impose two requirements on the vector $dx$. First, if the jth choice variable has a zero value
at the point $x^*$, then we shall only permit a nonnegative change on the $x_j$ axis, that is,

$dx_j \ge 0$     if     $x_j^* = 0$          (13.22)

Second, if the ith constraint is exactly satisfied at the point $x^*$, then we shall only allow
values of $dx_1, \ldots, dx_n$ such that the value of the constraint function $g^i(x^*)$ will not increase
(for a maximization problem) or will not decrease (for a minimization problem), that is,

$dg^i(x^*) = g^i_1\,dx_1 + g^i_2\,dx_2 + \cdots + g^i_n\,dx_n \begin{cases} \le 0 & (\text{max.}) \\ \ge 0 & (\text{min.}) \end{cases}$     if     $g^i(x^*) = r_i$          (13.23)

where all the partial derivatives $g^i_j$ are to be evaluated at $x^*$. If a vector $dx$ satisfies
(13.22) and (13.23), we shall refer to it as a test vector. Finally, if there exists a differentiable
arc that (1) emanates from the point $x^*$, (2) is contained entirely in the feasible
region, and (3) is tangent to a given test vector, we shall call it a qualifying arc for that test
vector. With this background, the constraint qualification can be stated simply as follows:

The constraint qualification is satisfied if, for any point $x^*$ on the boundary of the feasible
region, there exists a qualifying arc for every test vector $dx$.


Example 4

We shall show that the optimal point (1, 0) of Example 1 in Fig. 13.2, which fails the
Kuhn-Tucker conditions, also fails the constraint qualification. At that point, $x_2^* = 0$; thus the test
vector must satisfy

$dx_2 \ge 0$          [by (13.22)]

Moreover, since the (only) constraint, $g^1 = x_2 - (1 - x_1)^3 \le 0$, is exactly satisfied at (1, 0),
we must let [by (13.23)]

$g^1_1\,dx_1 + g^1_2\,dx_2 = 3(1 - x_1^*)^2\,dx_1 + dx_2 = dx_2 \le 0$


These two requirements together imply that we must let $dx_2 = 0$. In contrast, we are free
to choose $dx_1$. Thus, for instance, the vector $(dx_1, dx_2) = (2, 0)$ is an acceptable test vector,
as is $(dx_1, dx_2) = (-1, 0)$. The latter test vector would plot in Fig. 13.2 as an arrow starting
from (1, 0) and pointing in the due-west direction (not drawn), and it is clearly possible to
draw a qualifying arc for it. (The curved boundary of the feasible region itself can serve as a
qualifying arc.) On the other hand, the test vector $(dx_1, dx_2) = (2, 0)$ would plot as an
arrow starting from (1, 0) and pointing in the due-east direction (not drawn). Since there is
no way to draw a smooth arc tangent to this vector and lying entirely within the feasible
region, no qualifying arcs exist for it. Hence the optimal solution point (1, 0) violates the
constraint qualification.
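
The test-vector requirements of this example are easy to encode. The sketch below checks
(13.22) and (13.23) at (1, 0) and then applies a rough feasibility probe along a straight line.
The probe is our own heuristic and is an assumption of this illustration: a qualifying arc
need not be straight, so the probe indicates, but does not prove, the presence or absence of
such an arc.

```python
def is_test_vector(dx1, dx2):
    """Requirements (13.22) and (13.23) at x* = (1, 0) for g1 = x2 - (1 - x1)**3 <= 0."""
    x1_star = 1.0
    g1_1 = 3.0 * (1.0 - x1_star) ** 2   # partial of g1 with respect to x1 at x*, equals 0
    g1_2 = 1.0                           # partial of g1 with respect to x2
    return dx2 >= 0 and g1_1 * dx1 + g1_2 * dx2 <= 0


def feasible(x1, x2):
    """Membership in the feasible region of Example 1."""
    return x1 >= 0 and x2 >= 0 and x2 - (1.0 - x1) ** 3 <= 0


def straight_line_probe(dx1, dx2, steps=(1e-1, 1e-2, 1e-3)):
    """True if short straight moves from (1, 0) in the direction dx stay feasible."""
    return all(feasible(1.0 + t * dx1, 0.0 + t * dx2) for t in steps)


west_is_test_vector = is_test_vector(-1, 0)
east_is_test_vector = is_test_vector(2, 0)
west_stays_feasible = straight_line_probe(-1, 0)
east_stays_feasible = straight_line_probe(2, 0)
print(west_is_test_vector, east_is_test_vector, west_stays_feasible, east_stays_feasible)
```

Both vectors pass the test-vector requirements, but only the westward one remains inside the
feasible region, in line with the conclusion drawn above.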


416 Part Four Optimization Problents 


Example 5 


Example 6 


FIGURE 13.4 


Referring to Example 2, let us illustrate that, after an additional constraint $2x_1 + x_2 \le 2$
is added to Fig. 13.2, the point (1, 0) will satisfy the constraint qualification, thereby
revalidating the Kuhn-Tucker conditions.

As in Example 4, we have to require $dx_2 \ge 0$ (because $x_2^* = 0$) and $dx_2 \le 0$ (because
the first constraint is exactly satisfied); thus, $dx_2 = 0$. But the second constraint is also
exactly satisfied, thereby requiring

$$g_1^2\,dx_1 + g_2^2\,dx_2 = 2\,dx_1 + dx_2 = 2\,dx_1 \le 0 \qquad \text{[by (13.23)]}$$

With nonpositive $dx_1$ and zero $dx_2$, the only admissible test vectors (aside from the null
vector itself) are those pointing in the due-west direction in Fig. 13.2 from (1, 0). All of
these lie along the horizontal axis in the feasible region, and it is certainly possible to draw
a qualifying arc for each test vector. Hence, this time the constraint qualification indeed is
satisfied.


Linear Constraints 

Earlier, in Example 3, it was demonstrated that the convexity of the feasible set does not
guarantee the validity of the Kuhn-Tucker conditions as necessary conditions. However, if
the feasible region is a convex set formed by linear constraints only, then the constraint
qualification will invariably be met, and the Kuhn-Tucker conditions will always hold at an
optimal solution. This being the case, we need never worry about boundary irregularities
when dealing with a nonlinear programming problem with linear constraints.


Let us illustrate the linear-constraint result in the two-variable, two-constraint framework.
For a maximization problem, the linear constraints can be written as

$$a_{11}x_1 + a_{12}x_2 \le r_1$$
$$a_{21}x_1 + a_{22}x_2 \le r_2$$

where we shall take all the parameters to be positive. Then, as indicated in Fig. 13.4, the first
constraint border will have a slope of $-a_{11}/a_{12} < 0$, and the second, a slope of
$-a_{21}/a_{22} < 0$. The boundary points of the shaded feasible region fall into the following
five types: (1) the point of origin, where the two axes intersect, (2) points that lie on one axis
segment, such as J and S, (3) points at the intersection of one axis and one constraint border,
namely, K and R, (4) points lying on a single constraint border, such as L and N, (5) the point
of intersection of the two constraints, M. We may briefly examine each type in turn with
reference to the satisfaction of the constraint qualification.

[Figure 13.4: the shaded feasible region bounded by the two axes and by the constraint borders
$a_{11}x_1 + a_{12}x_2 = r_1$ (slope $= -a_{11}/a_{12}$) and $a_{21}x_1 + a_{22}x_2 = r_2$
(slope $= -a_{21}/a_{22}$).]


1. At the origin, no constraint is exactly satisfied, so we may ignore (13.23). But since
$x_1 = x_2 = 0$, we must choose test vectors with $dx_1 \ge 0$ and $dx_2 \ge 0$, by (13.22).
Hence all test vectors from the origin must point in the due-east, due-north, or northeast
directions, as depicted in Fig. 13.4. These vectors all happen to fall within the feasible set,
and a qualifying arc clearly can be found for each.

2. At a point like J, we can again ignore (13.23). The fact that $x_2 = 0$ means that we must
choose $dx_2 \ge 0$, but our choice of $dx_1$ is free. Hence all vectors would be acceptable
except those pointing southward ($dx_2 < 0$). Again all such vectors fall within the feasible
region, and there exists a qualifying arc for each. The analysis of point S is similar.

3. At points K and R, both (13.22) and (13.23) must be considered. Specifically, at K, we have
to choose $dx_2 \ge 0$ since $x_2 = 0$, so that we must rule out all southward arrows. The second
constraint being exactly satisfied, moreover, the test vectors for point K must satisfy

$$g_1^2\,dx_1 + g_2^2\,dx_2 = a_{21}\,dx_1 + a_{22}\,dx_2 \le 0 \qquad (13.24)$$

Since at K we also have $a_{21}x_1 + a_{22}x_2 = r_2$ (second constraint border), however, we may
add this equality to (13.24) and modify the restriction on the test vector to the form

$$a_{21}(x_1 + dx_1) + a_{22}(x_2 + dx_2) \le r_2 \qquad (13.24')$$

Interpreting $(x_j + dx_j)$ to be the new value of $x_j$ attained at the arrowhead of a test
vector, we may construe (13.24') to mean that all test vectors must have their arrowheads
located on or below the second constraint border. Consequently, all these vectors must again
fall within the feasible region, and a qualifying arc can be found for each. The analysis of
point R is analogous.

4. At points such as L and N, neither variable is zero and (13.22) can be ignored. However,
for point N, (13.23) dictates that

$$g_1^1\,dx_1 + g_2^1\,dx_2 = a_{11}\,dx_1 + a_{12}\,dx_2 \le 0 \qquad (13.25)$$

Since point N satisfies $a_{11}x_1 + a_{12}x_2 = r_1$ (first constraint border), we may add this
equality to (13.25) and write

$$a_{11}(x_1 + dx_1) + a_{12}(x_2 + dx_2) \le r_1 \qquad (13.25')$$

This would require the test vectors to have arrowheads located on or below the first
constraint border in Fig. 13.4. Thus we obtain essentially the same kind of result
encountered in the other cases. The analysis of point L is analogous.

5. At point M, we may again disregard (13.22), but this time (13.23) requires all test
vectors to satisfy both (13.24) and (13.25). Since we may modify the latter conditions to the
forms in (13.24') and (13.25'), all test vectors must now have their arrowheads located
on or below the first as well as the second constraint borders. The result thus again
duplicates those of the previous cases.
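The five cases can also be spot-checked by brute force. The sketch below is an illustration under assumed numbers: the coefficients $a_{ij}$, the right-hand sides $r_i$, and the coordinates chosen for the origin, J, K, L, M, and N are hypothetical values picked to mimic Fig. 13.4, not values given in the text. For each boundary point it enumerates a grid of candidate vectors, keeps those passing (13.22) and (13.23), and verifies that a short straight segment along each of them stays feasible, so that the segment itself can serve as a qualifying arc.

```python
from fractions import Fraction as F
from itertools import product

A = [(F(1), F(2)), (F(2), F(1))]  # hypothetical coefficients a_ij
r = [F(8), F(10)]                 # hypothetical right-hand sides r_i

def feasible(x):
    """Nonnegativity plus both linear constraints."""
    return all(v >= 0 for v in x) and all(
        a[0] * x[0] + a[1] * x[1] <= ri for a, ri in zip(A, r))

def is_test_vector(x, dx):
    # (13.22): dx_j >= 0 wherever x_j = 0
    if any(xj == 0 and dxj < 0 for xj, dxj in zip(x, dx)):
        return False
    # (13.23): a_i1 dx_1 + a_i2 dx_2 <= 0 for every exactly satisfied constraint
    return all(a[0] * dx[0] + a[1] * dx[1] <= 0
               for a, ri in zip(A, r)
               if a[0] * x[0] + a[1] * x[1] == ri)

# Illustrative boundary points of the five types discussed above.
points = {"origin": (F(0), F(0)), "J": (F(2), F(0)), "K": (F(5), F(0)),
          "L": (F(9, 2), F(1)), "M": (F(4), F(2)), "N": (F(2), F(3))}

t = F(1, 100)  # length of the short step taken along each test vector
directions = [d for d in product(range(-3, 4), repeat=2) if d != (0, 0)]

all_ok = all(
    feasible((x[0] + t * dx[0], x[1] + t * dx[1]))
    for x in points.values()
    for dx in directions
    if is_test_vector(x, dx))

print(all_ok)
```

Exact fractions are used so that "exactly satisfied" can be tested with equality rather than with a floating-point tolerance.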


In this example, it so happens that, for every type of boundary point considered, the test
vectors all lie within the feasible region. While this locational feature makes the qualifying
arcs easy to find, it is by no means a prerequisite for their existence. In a problem with a
nonlinear constraint border, in particular, the constraint border itself may serve as a
qualifying arc for some test vector that lies outside of the feasible region. An example of this
can be found in one of the problems below.





EXERCISE 13.2 


1. Check whether the solution point $(x_1^*, x_2^*) = (2, 6)$ in Example 3 satisfies the
constraint qualification.

2. Maximize    $\pi = x_1$
   subject to  $x_1^2 + x_2^2 \le 1$
   and         $x_1, x_2 \ge 0$
Solve graphically and check whether the optimal-solution point satisfies (a) the constraint
qualification and (b) the Kuhn-Tucker conditions.

3. Minimize    $C = x_1$
   subject to  $x_1^2 - x_2 \ge 0$
   and         $x_1, x_2 \ge 0$
Solve graphically. Does the optimal solution occur at a cusp? Check whether the optimal
solution satisfies (a) the constraint qualification and (b) the Kuhn-Tucker minimum
conditions.

4. Minimize    $C = x_1$
   subject to  $-x_2 - (1 - x_1)^3 \ge 0$
   and         $x_1, x_2 \ge 0$
Show that (a) the optimal solution $(x_1^*, x_2^*) = (1, 0)$ does not satisfy the Kuhn-Tucker
conditions, but (b) by introducing a new multiplier $\lambda_0 \ge 0$, and modifying the
Lagrangian function (13.15) to the form

$$Z_0 = \lambda_0 f(x_1, x_2, \ldots, x_n) + \sum_{i=1}^{m} \lambda_i \left[ r_i - g^i(x_1, x_2, \ldots, x_n) \right]$$

the Kuhn-Tucker conditions can be satisfied at (1, 0). (Note: The Kuhn-Tucker conditions
on the multipliers extend to only $\lambda_1, \ldots, \lambda_m$, but not to $\lambda_0$.)


13.3 Economic Applications 





War-Time Rationing 
Typically during times of war the civilian population is subject to some form of rationing
of basic consumer goods. Usually, the method of rationing is through the use of redeemable
coupons issued by the government. The government will supply each consumer with an
allotment of coupons each month. In turn, the consumer will have to redeem a certain number
of coupons at the time of purchase of a rationed good. This effectively means the consumer
pays two prices at the time of the purchase. He or she pays both the coupon price and
the monetary price of the rationed good. This requires the consumer to have both sufficient
funds and sufficient coupons in order to buy a unit of the rationed good.

Consider the case of a two-good world where both goods, x and y, are rationed. Let the
consumer's utility function be $U = U(x, y)$. The consumer has a fixed money budget of B
and faces exogenous prices $P_x$ and $P_y$. Further, the consumer has an allotment of coupons,
denoted C, which can be used to purchase either x or y at a coupon price of $c_x$ and $c_y$.
Therefore the consumer's maximization problem is

Maximize    $U = U(x, y)$
subject to  $P_x x + P_y y \le B$
            $c_x x + c_y y \le C$
and         $x, y \ge 0$

The Lagrangian for the problem is

$$Z = U(x, y) + \lambda_1 (B - P_x x - P_y y) + \lambda_2 (C - c_x x - c_y y)$$

where $\lambda_1$ and $\lambda_2$ are the Lagrange multipliers. Since both constraints are linear,
the constraint qualification is satisfied and the Kuhn-Tucker conditions are necessary:

$$\begin{aligned}
Z_x &= U_x - \lambda_1 P_x - \lambda_2 c_x \le 0 & x &\ge 0 & x Z_x &= 0 \\
Z_y &= U_y - \lambda_1 P_y - \lambda_2 c_y \le 0 & y &\ge 0 & y Z_y &= 0 \\
Z_{\lambda_1} &= B - P_x x - P_y y \ge 0 & \lambda_1 &\ge 0 & \lambda_1 Z_{\lambda_1} &= 0 \\
Z_{\lambda_2} &= C - c_x x - c_y y \ge 0 & \lambda_2 &\ge 0 & \lambda_2 Z_{\lambda_2} &= 0
\end{aligned}$$

Example 1

Suppose the utility function is of the form $U = xy^2$. Further, let B = 100 and
$P_x = P_y = 1$, while C = 120, $c_x = 2$, and $c_y = 1$.
The Lagrangian takes the specific form

$$Z = xy^2 + \lambda_1 (100 - x - y) + \lambda_2 (120 - 2x - y)$$

The Kuhn-Tucker conditions are now

$$\begin{aligned}
Z_x &= y^2 - \lambda_1 - 2\lambda_2 \le 0 & x &\ge 0 & x Z_x &= 0 \\
Z_y &= 2xy - \lambda_1 - \lambda_2 \le 0 & y &\ge 0 & y Z_y &= 0 \\
Z_{\lambda_1} &= 100 - x - y \ge 0 & \lambda_1 &\ge 0 & \lambda_1 Z_{\lambda_1} &= 0 \\
Z_{\lambda_2} &= 120 - 2x - y \ge 0 & \lambda_2 &\ge 0 & \lambda_2 Z_{\lambda_2} &= 0
\end{aligned}$$

Again, the solution procedure involves a certain amount of trial and error. We can first 
choose one of the constraints to be nonbinding and solve for x and y. Once found, use 
these values to test if the constraint chosen to be nonbinding is violated. If it is, then redo 
the procedure choosing another constraint to be nonbinding. If violation of the nonbind- 
ing constraint occurs again, then we can assume both constraints bind and the solution is 
determined only by the constraints. 

Step 1: Assume that the second (ration) constraint is nonbinding in the solution, so that
$\lambda_2 = 0$ by complementary slackness. But let x, y, and $\lambda_1$ be positive so that
complementary slackness would give us the following three equations:

$$Z_x = y^2 - \lambda_1 = 0$$
$$Z_y = 2xy - \lambda_1 = 0$$
$$Z_{\lambda_1} = 100 - x - y = 0$$

Solving for x and y yields a trial solution

$$x = 33\tfrac{1}{3} \qquad y = 66\tfrac{2}{3}$$

However, when we substitute these solutions into the coupon constraint we find that

$$2\left(33\tfrac{1}{3}\right) + 66\tfrac{2}{3} = 133\tfrac{1}{3} > 120$$

This solution violates the coupon constraint, and must be rejected.

Step 2: Now let us reverse the assumptions on $\lambda_1$ and $\lambda_2$ so that
$\lambda_1 = 0$, but let $\lambda_2, x, y > 0$. Then, from the marginal conditions, we have

$$Z_x = y^2 - 2\lambda_2 = 0$$
$$Z_y = 2xy - \lambda_2 = 0$$
$$Z_{\lambda_2} = 120 - 2x - y = 0$$

Solving this system of equations yields another trial solution

$$x = 20 \qquad y = 80$$

which implies that $\lambda_2 = 2xy = 3{,}200$. These solution values, together with
$\lambda_1 = 0$, satisfy both the budget and ration constraints. Thus we can accept them as the
final solution to the Kuhn-Tucker conditions.

This optimal solution, however, contains a curious abnormality. With the budget constraint
binding in the solution, we would normally expect the related Lagrange multiplier to
be positive, yet we actually have $\lambda_1 = 0$. Thus, in this example, while the budget
constraint is mathematically binding (satisfied as a strict equality in the solution), it is
economically nonbinding (not calling for a positive marginal utility of money).
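The two trial steps are easy to reproduce with exact arithmetic. The sketch below is only an illustration: the function names `step1` and `step2` are our own, and the closed forms inside them (y = 2x in Step 1, y = 4x in Step 2) come from equating the two marginal conditions in each step by hand.

```python
from fractions import Fraction as F

B, C = F(100), F(120)  # money budget and coupon allotment
# Money prices P_x = P_y = 1; coupon prices c_x = 2, c_y = 1; utility U = x*y**2.

def step1():
    """lambda2 = 0: y**2 = lambda1 and 2xy = lambda1 give y = 2x; then x + y = 100."""
    x = B / 3
    y = 2 * x
    lam1 = y ** 2
    return x, y, lam1

def step2():
    """lambda1 = 0: y**2 = 2*lambda2 and 2xy = lambda2 give y = 4x; then 2x + y = 120."""
    x = C / 6
    y = 4 * x
    lam2 = 2 * x * y
    return x, y, lam2

x1, y1, lam1 = step1()
coupons_needed = 2 * x1 + y1
step1_valid = coupons_needed <= C   # is the coupon constraint respected?

x2, y2, lam2 = step2()
money_needed = x2 + y2
step2_valid = money_needed <= B     # is the budget constraint respected?

print(x1, y1, coupons_needed, step1_valid)
print(x2, y2, lam2, money_needed, step2_valid)
```

Note that in Step 2 the money spent equals the budget exactly even though $\lambda_1 = 0$, which is the abnormality discussed above.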


Peak-Load Pricing 

Peak and off-peak pricing and planning problems are commonplace for firms with
capacity-constrained production processes. Usually the firm has invested in capacity in order to
target a primary market. However, there may exist a secondary market in which the firm can
often sell its product. Once the capital equipment has been purchased to service the firm's
primary market, it is freely available (up to capacity) to be used in the secondary market.
Typical examples include schools and universities that build to meet daytime needs (peak),
but may offer night-school classes (off-peak); theaters that offer shows in the evening
(peak) and matinees (off-peak); and trucking companies that have dedicated routes but
may choose to enter "back-haul" markets. Since the capacity cost is a factor in the
profit-maximizing decision for the peak market and is already paid, it normally should not be a
factor in calculating optimal price and quantity for the smaller, off-peak market. However,
if the secondary market's demand is close to the same size as the primary market, capacity
constraints may be an issue, especially since it is a common practice to price discriminate
and charge lower prices in off-peak periods. Even though the secondary market is smaller
than the primary, it is possible that, at the lower (profit-maximizing) price, off-peak demand
exceeds capacity. In such cases capacity choices must be made taking both markets into
account, making the problem a classic application of nonlinear programming.

Consider a profit-maximizing company that faces two average-revenue curves

$$P_1 = P^1(Q_1) \qquad \text{in the day time (peak period)}$$
$$P_2 = P^2(Q_2) \qquad \text{in the night time (off-peak period)}$$

To operate, the firm must pay b per unit of output, whether it is day or night. Furthermore,
the firm must purchase capacity at a cost of c per unit of capacity. Let K denote total capacity
measured in units of Q. The firm must pay for capacity, regardless of whether it operates in
the off-peak period. Who should be charged for the capacity costs: peak, off-peak, or both
sets of customers? The firm's maximization problem becomes

Maximize    $\pi = P_1 Q_1 + P_2 Q_2 - b(Q_1 + Q_2) - cK$
subject to  $Q_1 \le K$
            $Q_2 \le K$
where       $P_1 = P^1(Q_1)$
            $P_2 = P^2(Q_2)$
and         $Q_1, Q_2, K \ge 0$

In view of the fact that the total revenue for $Q_i$,

$$R_i = P_i Q_i = P^i(Q_i)\,Q_i$$

is a function of $Q_i$ alone, we can simplify the statement of the problem to

Maximize    $\pi = R_1(Q_1) + R_2(Q_2) - b(Q_1 + Q_2) - cK$
subject to  $Q_1 \le K$
            $Q_2 \le K$
and         $Q_1, Q_2, K \ge 0$

Note that both constraints are linear; thus the constraint qualification is satisfied and the
Kuhn-Tucker conditions are necessary.
The Lagrangian function is

$$Z = R_1(Q_1) + R_2(Q_2) - b(Q_1 + Q_2) - cK + \lambda_1 (K - Q_1) + \lambda_2 (K - Q_2)$$

and the Kuhn-Tucker conditions are

$$\begin{aligned}
Z_1 &= MR_1 - b - \lambda_1 \le 0 & Q_1 &\ge 0 & Q_1 Z_1 &= 0 \\
Z_2 &= MR_2 - b - \lambda_2 \le 0 & Q_2 &\ge 0 & Q_2 Z_2 &= 0 \\
Z_K &= -c + \lambda_1 + \lambda_2 \le 0 & K &\ge 0 & K Z_K &= 0 \\
Z_{\lambda_1} &= K - Q_1 \ge 0 & \lambda_1 &\ge 0 & \lambda_1 Z_{\lambda_1} &= 0 \\
Z_{\lambda_2} &= K - Q_2 \ge 0 & \lambda_2 &\ge 0 & \lambda_2 Z_{\lambda_2} &= 0
\end{aligned}$$


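where $MR_i$ is the marginal revenue of $Q_i$ (i = 1, 2).
The solution procedure again entails trial and error. Let us first assume that
$Q_1, Q_2, K > 0$. Then, by complementary slackness, we have

$$\begin{aligned}
MR_1 - b - \lambda_1 &= 0 \\
MR_2 - b - \lambda_2 &= 0 \\
-c + \lambda_1 + \lambda_2 &= 0 \qquad (\lambda_1 = c - \lambda_2)
\end{aligned} \qquad (13.26)$$

which can be condensed into two equations after eliminating $\lambda_1$:

$$\begin{aligned}
MR_1 &= b + c - \lambda_2 \\
MR_2 &= b + \lambda_2
\end{aligned} \qquad (13.26')$$

Then we proceed in two steps.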


[Figure 13.5: two panels plotting $MR_1$ and $MR_2$ against output ($Q_1$, $Q_2$, K), with
horizontal lines at b and b + c. (a) Off-peak constraint nonbinding. (b) Off-peak constraint
binding.]


Step 1: Since the off-peak market is a secondary market, its marginal-revenue function
($MR_2$) can be expected to lie below that of the primary market ($MR_1$) as illustrated in
Fig. 13.5. Moreover, the capacity constraint is more likely to be nonbinding in the
secondary market so that $\lambda_2$ is more likely to be zero. So we try $\lambda_2 = 0$.
Then (13.26') becomes

$$\begin{aligned}
MR_1 &= b + c \\
MR_2 &= b
\end{aligned} \qquad (13.26'')$$

The fact that the primary market absorbs the entire capacity cost c implies that $Q_1 = K$.
However, we still need to check whether the constraint $Q_2 \le K$ is satisfied. If so, we have
found a valid solution. Figure 13.5(a) illustrates the case where $Q_1 = K$ and $Q_2 < K$ in
the solution. The $MR_1$ curve intersects the b + c line at point $E_1$, and the $MR_2$ curve
intersects the b line at point $E_2$.

What if the previous trial solution entails $Q_2 > K$, as would occur if the $MR_2$ curve is
very close to $MR_1$, so as to intersect the b line at an output larger than K? Then, of course,
the second constraint is violated, and we must reject the assumption of $\lambda_2 = 0$, and
proceed to the next step.

Step 2: Now let us assume both Lagrange multipliers to be positive, and thus
$Q_1 = Q_2 = K$. Then, unable to eliminate any variables from (13.26), we have

$$\begin{aligned}
MR_1 &= b + \lambda_1 \\
MR_2 &= b + \lambda_2 \\
c &= \lambda_1 + \lambda_2
\end{aligned} \qquad (13.26''')$$

This case is illustrated in Fig. 13.5(b), where points $E_1$ and $E_2$ satisfy the first two
equations in (13.26'''). From the third equation, we see that the capacity cost c is the sum of
the two Lagrange multipliers. This means $\lambda_1$ and $\lambda_2$ represent the portions of the
capacity cost borne respectively by the two markets.





Example 2

Suppose the average-revenue function during peak hours is

$$P_1 = 22 - 10^{-5} Q_1$$

and that during off-peak hours it is

$$P_2 = 18 - 10^{-5} Q_2$$

To produce a unit of output per half-day requires a unit of capacity costing 8 cents per day.
The cost of a unit of capacity is the same whether it is used at peak times only, or off-peak
also. In addition to the costs of capacity, it costs 6 cents in operating costs (labor and fuel)
to produce 1 unit per half-day (both day and evening).
If we assume that the capacity constraint is nonbinding in the secondary market
($\lambda_2 = 0$), then the given Kuhn-Tucker conditions become

$$\underbrace{22 - 2 \times 10^{-5} Q_1}_{MR_1} = \underbrace{b + c}_{MC} = 14$$
$$\underbrace{18 - 2 \times 10^{-5} Q_2}_{MR_2} = \underbrace{b}_{MC} = 6$$

Solving this system gives us

$$Q_1 = 400{,}000$$
$$Q_2 = 600{,}000$$

which violates the assumption that the second constraint is nonbinding because
$Q_2 > Q_1 = K$.
Therefore, let us assume that both constraints are binding. Then $Q_1 = Q_2 = Q$ and the
Kuhn-Tucker conditions become

$$\lambda_1 + \lambda_2 = 8$$
$$22 - 2 \times 10^{-5} Q = 6 + \lambda_1$$
$$18 - 2 \times 10^{-5} Q = 6 + \lambda_2$$

which yield the following solution

$$Q_1 = Q_2 = K = 500{,}000$$
$$\lambda_1 = 6 \qquad \lambda_2 = 2$$
$$P_1 = 17 \qquad P_2 = 13$$

Since the capacity constraint is binding in both markets, the primary market pays
$\lambda_1 = 6$ of the capacity cost and the secondary market pays $\lambda_2 = 2$.
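The arithmetic of this example can be retraced with a few lines of exact computation. This is a sketch only: the variable names (`slope`, `a1`, `a2`, `trial1_valid`) are our own, and the formula for K is obtained by adding the two marginal-revenue equations and substituting $\lambda_1 + \lambda_2 = c$.

```python
from fractions import Fraction as F

b, c = F(6), F(8)         # operating cost and capacity cost (cents)
slope = F(1, 100000)      # demand slope 10^-5 in both periods
a1, a2 = F(22), F(18)     # demand intercepts: P_i = a_i - slope * Q_i

# Trial 1: lambda2 = 0, so MR1 = b + c and MR2 = b, where MR_i = a_i - 2*slope*Q_i.
Q1 = (a1 - (b + c)) / (2 * slope)
Q2 = (a2 - b) / (2 * slope)
trial1_valid = Q2 <= Q1   # capacity K = Q1 would have to cover off-peak output

# Trial 2: both constraints bind, so Q1 = Q2 = K.
# Adding MR1 = b + lam1 and MR2 = b + lam2, with lam1 + lam2 = c:
#   a1 + a2 - 4*slope*K = 2*b + c
K = (a1 + a2 - 2 * b - c) / (4 * slope)
lam1 = a1 - 2 * slope * K - b
lam2 = a2 - 2 * slope * K - b
P1 = a1 - slope * K
P2 = a2 - slope * K

print(Q1, Q2, trial1_valid)
print(K, lam1, lam2, P1, P2)
```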





EXERCISE 13.3 


1. Suppose in Example 2 a unit of capacity costs only 3 cents per day.
(a) What would be the profit-maximizing peak and off-peak prices and quantities?
(b) What would be the values of the Lagrange multipliers? What interpretation do you
put on their values?

2. A consumer lives on an island where she produces two goods, x and y, according to the
production possibility frontier $x^2 + y^2 \le 200$, and she consumes all the goods herself.
Her utility function is

$$U = xy^3$$

The consumer also faces an environmental constraint on her total output of both
goods. The environmental constraint is given by $x + y \le 20$.
(a) Write out the Kuhn-Tucker first-order conditions.
(b) Find the consumer's optimal x and y. Identify which constraints are binding.

3. An electric company is setting up a power plant in a foreign country, and it has to plan its
capacity. The peak-period demand for power is given by $P_1 = 400 - Q_1$ and the off-peak
demand is given by $P_2 = 380 - Q_2$. The variable cost is 20 per unit (paid in both
markets) and capacity costs 10 per unit, which is only paid once and is used in both periods.
(a) Write out the Lagrangian and Kuhn-Tucker conditions for this problem.
(b) Find the optimal outputs and capacity for this problem.
(c) How much of the capacity is paid for by each market (i.e., what are the values of
$\lambda_1$ and $\lambda_2$)?
(d) Now suppose capacity cost is 30 cents per unit (paid only once). Find quantities,
capacity, and how much of the capacity is paid for by each market (i.e., $\lambda_1$ and
$\lambda_2$).


13.4 Sufficiency Theorems in Nonlinear Programming 





In the previous sections, we have introduced the Kuhn-Tucker conditions and illustrated
their applications as necessary conditions in optimization problems with inequality
constraints. Under certain circumstances, the Kuhn-Tucker conditions can also be taken as
sufficient conditions.

The Kuhn-Tucker Sufficiency Theorem: Concave Programming
In classical optimization problems, the sufficient conditions for maximum and minimum
are traditionally expressed in terms of the signs of second-order derivatives or differentials.
As we have shown in Sec. 11.5, however, these second-order conditions are closely related
to the concepts of concavity and convexity of the objective function. Here, in nonlinear
programming, the sufficient conditions can also be stated directly in terms of concavity and
convexity. And, in fact, these concepts will be applied not only to the objective function
$f(x)$ but to the constraint functions $g^i(x)$ as well.
For the maximization problem, Kuhn and Tucker offer the following statement of
sufficient conditions (sufficiency theorem):

Given the nonlinear programming problem

Maximize    $\pi = f(x)$
subject to  $g^i(x) \le r_i$    (i = 1, 2, ..., m)
and         $x \ge 0$

if the following conditions are satisfied:
(a) the objective function $f(x)$ is differentiable and concave in the nonnegative orthant
(b) each constraint function $g^i(x)$ is differentiable and convex in the nonnegative orthant
(c) the point x* satisfies the Kuhn-Tucker maximum conditions

then x* gives a global maximum of $\pi = f(x)$.


Note that, in this theorem, the constraint qualification is nowhere mentioned. This is
because we have already assumed, in condition (c), that the Kuhn-Tucker conditions are
satisfied at x* and, consequently, the question of the constraint qualification is no longer
an issue.

As it stands, the above theorem indicates that conditions (a), (b), and (c) are sufficient to
establish x* to be an optimal solution. Looking at it differently, however, we may also
interpret it to mean that given (a) and (b), then the Kuhn-Tucker maximum conditions are
sufficient for a maximum. In the preceding section, we learned that the Kuhn-Tucker
conditions, though not necessary per se, become necessary when the constraint qualification
is satisfied. Combining this information with the sufficiency theorem, we may now state
that if the constraint qualification is satisfied and if conditions (a) and (b) are realized, then
the Kuhn-Tucker maximum conditions will be necessary-and-sufficient for a maximum.
This would be the case, for instance, when all the constraints are linear inequalities, which
is sufficient for satisfying the constraint qualification.

The maximization problem dealt with in the sufficiency theorem above is often referred to
as concave programming. This name arises because Kuhn and Tucker adopt the $\ge$ inequality
instead of the $\le$ inequality in every constraint, so that condition (b) would require the
$g^i(x)$ functions to be all concave, like the $f(x)$ function. But we have modified the
formulation in order to convey the idea that in a maximization problem, a constraint is imposed
to "rein in" (hence, $\le$) the attempt to ascend to higher points on the objective function.
Though different in form, the two formulations are equivalent in substance. For brevity, we omit
the proof.

As stated above, the sufficiency theorem deals only with maximization problems. But
adaptation to minimization problems is by no means difficult. Aside from the appropriate
changes in the theorem to reflect the reversal of the problem itself, all we have to do is to
interchange the two words concave and convex in conditions (a) and (b) and to use the
Kuhn-Tucker minimum conditions in condition (c). (See Exercise 13.4-1.)








The Arrow-Enthoven Sufficiency Theorem: Quasiconcave Programming

To apply the Kuhn-Tucker sufficiency theorem, certain concavity-convexity specifications
must be met. These constitute quite stringent requirements. In another sufficiency theorem,
the Arrow-Enthoven sufficiency theorem,* these specifications are relaxed to the extent of
requiring only quasiconcavity and quasiconvexity in the objective and constraint functions.
With the requirements thus weakened, the scope of applicability of the sufficient conditions
is correspondingly widened.

In the original formulation of the Arrow-Enthoven paper, with a maximization problem
and with constraints in the $\ge$ form, the $f(x)$ and $g^i(x)$ functions must uniformly be
quasiconcave in order for their theorem to be applicable. This gives rise to the name
quasiconcave programming. In our discussion here, however, we shall again use the $\le$
inequality in the constraints of a maximization problem and the $\ge$ inequality in the
minimization problem.

The theorem is as follows:

Given the nonlinear programming problem

Maximize    $\pi = f(x)$
subject to  $g^i(x) \le r_i$    (i = 1, 2, ..., m)
and         $x \ge 0$

if the following conditions are satisfied:

(a) the objective function $f(x)$ is differentiable and quasiconcave in the nonnegative
orthant
(b) each constraint function $g^i(x)$ is differentiable and quasiconvex in the nonnegative
orthant
(c) the point x* satisfies the Kuhn-Tucker maximum conditions
(d) any one of the following is satisfied:
    (d-i) $f_j(x^*) < 0$ for at least one variable $x_j$
    (d-ii) $f_j(x^*) > 0$ for some variable $x_j$ that can take on a positive value without
    violating the constraints
    (d-iii) the n derivatives $f_j(x^*)$ are not all zero, and the function $f(x)$ is twice
    differentiable in the neighborhood of x* [i.e., all the second-order partial
    derivatives of $f(x)$ exist at x*]
    (d-iv) the function $f(x)$ is concave

then x* gives a global maximum of $\pi = f(x)$.

* Kenneth J. Arrow and Alain C. Enthoven, "Quasi-concave Programming," Econometrica, October,
1961, pp. 779-800.

Since the proof of this theorem is quite lengthy, we shall omit it here. However, we do
want to call your attention to a few important features of this theorem. For one thing, while
Arrow and Enthoven have succeeded in weakening the concavity-convexity specifications
to their quasiconcavity-quasiconvexity counterparts, they find it necessary to append a new
requirement, (d). Note, though, that only one of the four alternatives listed under (d) is
required to form a complete set of sufficient conditions. In effect, therefore, the above
theorem contains as many as four different sets of sufficient conditions for a maximum.
In the case of (d-iv), with $f(x)$ concave, it would appear that the Arrow-Enthoven
sufficiency theorem becomes identical with the Kuhn-Tucker sufficiency theorem. But this is
not true. Inasmuch as Arrow and Enthoven only require the constraint functions $g^i(x)$ to be
quasiconvex, their sufficient conditions are still weaker.

As stated, the theorem lumps together the conditions (a) through (d) as a set of sufficient
conditions. But it is also possible to interpret it to mean that, when (a), (b), and (d) are
satisfied, then the Kuhn-Tucker maximum conditions become sufficient conditions for a
maximum. Furthermore, if the constraint qualification is also satisfied, then the Kuhn-Tucker
conditions will become necessary-and-sufficient for a maximum.

Like the Kuhn-Tucker theorem, the Arrow-Enthoven theorem can be adapted with ease
to the minimization framework. Aside from the obvious changes that are needed to reverse
the direction of optimization, we simply have to interchange the words quasiconcave and
quasiconvex in conditions (a) and (b), replace the Kuhn-Tucker maximum conditions by
the minimum conditions, reverse the inequalities in (d-i) and (d-ii), and change the word
concave to convex in (d-iv).


A Constraint-Qualification Test 

It was mentioned in Sec. 13.2 that if all constraint functions are linear, then the constraint
qualification is satisfied. In case the $g^i(x)$ functions are nonlinear, the following test
offered by Arrow and Enthoven may prove useful in determining whether the constraint
qualification is satisfied:

For a maximization problem, if

(a) every constraint function $g^i(x)$ is differentiable and quasiconvex
(b) there exists a point $x^0$ in the nonnegative orthant such that all the constraints are
satisfied as strict inequalities at $x^0$
(c) one of the following is true:
    (c-i) every $g^i(x)$ function is convex
    (c-ii) the partial derivatives of every $g^i(x)$ are not all zero when evaluated at every
    point x in the feasible region

then the constraint qualification is satisfied.
Again, this test can be adapted to the minimization problem with ease. To do so, just change
the word quasiconvex to quasiconcave in condition (a), and change the word convex to
concave in (c-i).





EXERCISE 13.4 


1. Given:
   Minimize    $C = F(x)$
   subject to  $G^i(x) \ge r_i$
   and         $x \ge 0$
(a) Convert it into a maximization problem.
(b) What in the present problem are the equivalents of the $f$ and $g^i$ functions in the
Kuhn-Tucker sufficiency theorem?
(c) Hence, what concavity-convexity conditions should be placed on the $F$ and $G^i$
functions to make the sufficient conditions for a maximum applicable here?
(d) On the basis of the above, how would you state the Kuhn-Tucker sufficient conditions
for a minimum?

2. Is the Kuhn-Tucker sufficiency theorem applicable to:
(a) Maximize    $\pi = x_1$
    subject to  $x_1^2 + x_2^2 \le 1$
    and         $x_1, x_2 \ge 0$
(b) Minimize    $C = (x_1 - 3)^2 + (x_2 - 4)^2$
    subject to  $x_1 + x_2 \ge 4$
    and         $x_1, x_2 \ge 0$
(c) Minimize    $C = 2x_1 + x_2$
    subject to  $x_1^2 - 4x_1 + x_2 \ge 0$
    and         $x_1, x_2 \ge 0$

3. Which of the following functions are mathematically acceptable as the objective
function of a maximization problem which qualifies for the application of the
Arrow-Enthoven sufficiency theorem?
(a) $f(x) = x^3 - 2x$
(b) $f(x_1, x_2) = 6x_1 - 9x_2$
(c) $f(x_1, x_2) = x_2 - \ln x_1$    (Note: See Exercise 12.4-4.)

4. Is the Arrow-Enthoven constraint qualification satisfied, given that the constraints of a
maximization problem are:
(a) $x_1^2 + (x_2 - 5)^2 \le 4$ and $5x_1 + x_2 < 10$
(b) $x_1 + x_2 \le 8$ and $-x_1 x_2 \le -8$    (Note: $-x_1 x_2$ is not convex.)

13.5 Maximum-Value Functions and the Envelope Theorem*





A maximum-value function is an objective function where the choice variables have been
assigned their optimal values. These optimal values of the choice variables are, in turn,
functions of the exogenous variables and parameters of the problem. Once the optimal values
of the choice variables have been substituted into the original objective function, the
function indirectly becomes a function of the parameters only (through the parameters'
influence on the optimal values of the choice variables). Thus the maximum-value function is
also referred to as the indirect objective function.


The Envelope Theorem for Unconstrained Optimization 
What is the significance of the indirect objective function? Consider that in any
optimization problem the direct objective function is maximized (or minimized) for a given set of
parameters. The indirect objective function traces out all the maximum values of the
objective function as these parameters vary. Hence the indirect objective function is an
"envelope" of the set of optimized objective functions generated by varying the parameters
of the model. For most students of economics the first illustration of this notion of an
envelope arises in the comparison of short-run and long-run cost curves. Students are
typically taught that the long-run average cost curve is an envelope of all the short-run average
cost curves (what parameter is varying along the envelope in this case?). A formal derivation
of this concept is one of the exercises we will be doing in this section.

To illustrate, consider the following unconstrained maximization problem with two
choice variables x and y and one parameter $\phi$:

$$\text{Maximize} \quad U = f(x, y, \phi) \qquad (13.27)$$

The first-order necessary condition is

$$f_x(x, y, \phi) = f_y(x, y, \phi) = 0 \qquad (13.28)$$

If second-order conditions are met, these two equations implicitly define the solutions

$$x^* = x^*(\phi) \qquad y^* = y^*(\phi) \qquad (13.29)$$

If we substitute these solutions into the objective function, we obtain a new function

$$V(\phi) = f(x^*(\phi), y^*(\phi), \phi) \qquad (13.30)$$

where this function is the value of $f$ when the values of x and y are those that maximize
$f(x, y, \phi)$. Therefore, $V(\phi)$ is the maximum-value function (or indirect objective
function).

* This section of the chapter presents an overview of the envelope theorem. A richer treatment of this
topic can be found in Chap. 7 of The Structure of Economics: A Mathematical Analysis (3rd ed.) by
Eugene Silberberg and Wing Suen (McGraw-Hill, 2001) on which parts of this section are based.

If we differentiate V with respect to @, its only argument, we get 

dV ax® ays 

—=f-—+fA—t+h 13.31 

a Sigg theag th (13.31) 
However, from the first-order conditions we know f, = Jy = 0. Therefore, the first two 
terms disappear and the result becomes 


W , 
ant (13.31') 


This result says that, at the optimum, as $\phi$ varies, with $x^*$ and $y^*$ allowed to adjust, the
derivative $dV/d\phi$ gives the same result as if $x^*$ and $y^*$ were treated as constants. Note that $\phi$
enters the maximum-value function (13.30) in three places: one direct and two indirect
(through $x^*$ and $y^*$). Equation (13.31') shows that, at the optimum, only the direct effect of
$\phi$ on the objective function matters. This is the essence of the envelope theorem. The envelope
theorem says that only the direct effects of a change in an exogenous variable need be
considered, even though the exogenous variable may also enter the maximum-value function
indirectly as part of the solution to the endogenous choice variables.
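
As a numerical illustration of (13.31'), the sketch below uses a simple quadratic objective of our own choosing (it does not appear in the text), for which the first-order conditions can be solved by hand. It compares the total derivative of the maximum-value function, where $x^*$ and $y^*$ adjust, with the direct effect $f_\phi$, where they are held fixed. The function names and the step size h are illustrative assumptions.

```python
# Illustrative objective (an assumption, not taken from the text):
#   f(x, y, phi) = -x**2 - y**2 + phi*x + 2*phi*y

def f(x, y, phi):
    return -x**2 - y**2 + phi * x + 2 * phi * y

def optimum(phi):
    # First-order conditions: f_x = -2x + phi = 0 and f_y = -2y + 2*phi = 0
    return phi / 2, phi

def V(phi):
    # Maximum-value function V(phi) = f(x*(phi), y*(phi), phi)
    x_star, y_star = optimum(phi)
    return f(x_star, y_star, phi)

def total_effect(phi, h=1e-5):
    # dV/dphi by central difference; x* and y* adjust as phi changes
    return (V(phi + h) - V(phi - h)) / (2 * h)

def direct_effect(phi, h=1e-5):
    # f_phi by central difference; x and y are frozen at their optimal values
    x_star, y_star = optimum(phi)
    return (f(x_star, y_star, phi + h) - f(x_star, y_star, phi - h)) / (2 * h)

phi0 = 3.0
print(total_effect(phi0), direct_effect(phi0))  # both are close to 2.5 * phi0
```

For this objective $V(\phi) = 1.25\phi^2$, so both numbers should agree with $2.5\phi$; the two indirect channels contribute nothing at the optimum.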


The Profit Function 
Let us now apply the notion of the maximum-value function to derive the profit function of 
a competitive firm. Consider the case where a firm uses two inputs: capital $K$ and labor $L$.
The profit function is

\[ \pi = P f(K, L) - wL - rK \tag{13.32} \]
where P is the output price and w and r are the wage rate and rental rate, respectively. 

The first-order conditions are 
\[ \pi_L = P f_L(K, L) - w = 0 \qquad \pi_K = P f_K(K, L) - r = 0 \tag{13.33} \]
which respectively define the input-demand equations
\[ L^* = L^*(w, r, P) \qquad K^* = K^*(w, r, P) \tag{13.34} \]
Substituting the solutions $K^*$ and $L^*$ into the objective function gives us
\[ \pi^*(w, r, P) = P f(K^*, L^*) - wL^* - rK^* \tag{13.35} \]


where $\pi^*(w, r, P)$ is the profit function (an indirect objective function). The profit function
gives the maximum profit as a function of the exogenous variables $w$, $r$, and $P$.

Now consider the effect of a change in $w$ on the firm's profits. If we differentiate
the original profit function (13.32) with respect to $w$, holding all other variables constant,
we get

\[ \frac{\partial \pi}{\partial w} = -L \tag{13.36} \]

However, this result does not take into account the profit-maximizing firm's ability to make
a substitution of capital for labor and adjust the level of output in accordance with
profit-maximizing behavior.




In contrast, since $\pi^*(w, r, P)$ is the maximum value of profits for any values of $w$, $r$, and
$P$, a change in $\pi^*$ resulting from a change in $w$ takes all capital-for-labor substitutions into account.
To evaluate a change in the maximum profit function caused by a change in $w$, we differentiate
$\pi^*(w, r, P)$ with respect to $w$ to obtain

\[ \frac{\partial \pi^*}{\partial w} = (P f_L - w) \frac{\partial L^*}{\partial w} + (P f_K - r) \frac{\partial K^*}{\partial w} - L^* \tag{13.37} \]

From the first-order conditions (13.33), the two terms in parentheses are equal to zero.
Therefore, the equation becomes

\[ \frac{\partial \pi^*}{\partial w} = -L^*(w, r, P) \tag{13.38} \]
This result says that, at the profit-maximizing position, a change in profits with respect to a
change in the wage rate is the same whether the factors are held constant or allowed
to vary as the factor price changes. In this case, (13.38) shows that the derivative of the
profit function with respect to $w$ is the negative of the factor demand function $L^*(w, r, P)$.
Following the preceding procedure, we can also show the additional comparative-static
results:

\[ \frac{\partial \pi^*(w, r, P)}{\partial r} = -K^*(w, r, P) \tag{13.39} \]

and

\[ \frac{\partial \pi^*(w, r, P)}{\partial P} = f(K^*, L^*) \tag{13.40} \]


Equations (13.38), (13.39), and (13.40) are collectively known as Hotelling's lemma. We
have obtained these comparative-static derivatives from the profit function by allowing $K^*$
and $L^*$ to adjust to any parameter change. But it is easy to see that the same results will
emerge if we differentiate the profit function (13.35) with respect to each parameter while
holding $K^*$ and $L^*$ constant. Thus Hotelling's lemma is simply another manifestation of the
envelope theorem that we encountered earlier in (13.31').
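
Hotelling's lemma can be checked numerically for a specific technology. The sketch below assumes the production function $f(K, L) = K^{1/4} L^{1/4}$, which is our own illustrative choice and not part of the text; for it, the first-order conditions (13.33) have the closed-form solutions coded in input_demands. The helper partial is a hypothetical central-difference routine.

```python
# Assumed technology (illustration only): f(K, L) = K**0.25 * L**0.25

def input_demands(w, r, P):
    # Closed-form solutions of P*f_L = w and P*f_K = r for this technology
    L_star = P**2 / (16 * w**1.5 * r**0.5)
    K_star = P**2 / (16 * w**0.5 * r**1.5)
    return K_star, L_star

def output(w, r, P):
    K_star, L_star = input_demands(w, r, P)
    return K_star**0.25 * L_star**0.25

def profit_function(w, r, P):
    # pi*(w, r, P) = P f(K*, L*) - w L* - r K*, as in (13.35)
    K_star, L_star = input_demands(w, r, P)
    return P * output(w, r, P) - w * L_star - r * K_star

def partial(fn, args, i, h=1e-5):
    # Central-difference partial derivative of fn with respect to argument i
    up, dn = list(args), list(args)
    up[i] += h
    dn[i] -= h
    return (fn(*up) - fn(*dn)) / (2 * h)

point = (2.0, 1.0, 8.0)  # (w, r, P)
K_star, L_star = input_demands(*point)
print(partial(profit_function, point, 0), -L_star)        # (13.38)
print(partial(profit_function, point, 1), -K_star)        # (13.39)
print(partial(profit_function, point, 2), output(*point)) # (13.40)
```

Each printed pair should agree up to the finite-difference error, even though $K^*$ and $L^*$ are free to adjust inside profit_function.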


Reciprocity Condition 
Consider again our two-variable unconstrained maximization problem 
\[ \text{Maximize} \quad U = f(x, y, \phi) \qquad \text{[from (13.27)]} \]

where $x$ and $y$ are the choice variables and $\phi$ is a parameter. The first-order conditions are
$f_x = f_y = 0$, which imply $x^* = x^*(\phi)$ and $y^* = y^*(\phi)$.

We are interested in the comparative statics regarding the directions of change in $x^*(\phi)$ and
$y^*(\phi)$ as $\phi$ changes and the effects on the value function. The maximum-value function is

\[ V(\phi) = f(x^*(\phi), y^*(\phi), \phi) \tag{13.41} \]

By definition, $V(\phi)$ gives the maximum value of $f$ for any given $\phi$.
Now consider a new function that depicts the difference between the actual value and the
maximum value of $U$:

\[ \Omega(x, y, \phi) = f(x, y, \phi) - V(\phi) \tag{13.42} \]

This new function $\Omega$ has a maximum value of zero when $x = x^*$ and $y = y^*$; for any
$x \neq x^*$, $y \neq y^*$ we have $f \leq V$. In this framework $\Omega(x, y, \phi)$ can be considered a function
of three independent variables, $x$, $y$, and $\phi$. The maximum of $\Omega(x, y, \phi) = f(x, y, \phi) - V(\phi)$
can be determined by the first- and second-order conditions.
The first-order conditions are 


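\[ \Omega_x(x, y, \phi) = f_x = 0 \qquad \Omega_y(x, y, \phi) = f_y = 0 \tag{13.43} \]
and
\[ \Omega_\phi(x, y, \phi) = f_\phi - V_\phi = 0 \tag{13.44} \]
We can see that the first-order conditions of our new function $\Omega$ in (13.43) are nothing but
the original maximum conditions for $f(x, y, \phi)$ in (13.28), whereas the condition in
(13.44) really restates the envelope theorem (13.31'). These first-order conditions hold
whenever $x = x^*(\phi)$ and $y = y^*(\phi)$. The second-order sufficient conditions are satisfied if
the Hessian of $\Omega$

\[ H = \begin{bmatrix} f_{xx} & f_{xy} & f_{x\phi} \\ f_{yx} & f_{yy} & f_{y\phi} \\ f_{\phi x} & f_{\phi y} & f_{\phi\phi} - V_{\phi\phi} \end{bmatrix} \]

is characterized by
\[ f_{xx} < 0 \qquad f_{xx} f_{yy} - f_{xy}^2 > 0 \qquad |H| < 0 \]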


In deriving the Hessian above, we listed the variables in the order $(x, y, \phi)$ and, consequently,
the first entry in the second-order conditions, $(\Omega_{xx} =)\ f_{xx} < 0$, relates to the variable
$x$. Had we adopted an alternative listing order, then the first entry could have been
$\Omega_{yy} = f_{yy} < 0$, or

\[ \Omega_{\phi\phi} = f_{\phi\phi} - V_{\phi\phi} < 0 \tag{13.45} \]


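It turns out that (13.45) can lead us to a result that provides a quick way to reach a
comparative-static conclusion. First, we know from (13.41) that

\[ V_\phi(\phi) \equiv f_\phi(x^*(\phi), y^*(\phi), \phi) \]

Differentiating both sides with respect to $\phi$ yields

\[ V_{\phi\phi} = f_{\phi x} \frac{\partial x^*}{\partial \phi} + f_{\phi y} \frac{\partial y^*}{\partial \phi} + f_{\phi\phi} \tag{13.46} \]
Using (13.45) and Young's theorem, we can write
\[ V_{\phi\phi} - f_{\phi\phi} = f_{x\phi} \frac{\partial x^*}{\partial \phi} + f_{y\phi} \frac{\partial y^*}{\partial \phi} > 0 \tag{13.47} \]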


Suppose that $\phi$ enters only in the first-order condition for $x$, such that $f_{y\phi} = 0$. Then
(13.47) reduces to

\[ f_{x\phi} \frac{\partial x^*}{\partial \phi} > 0 \tag{13.48} \]
which implies that $f_{x\phi}$ and $\partial x^*/\partial \phi$ will have the same sign. Thus, whenever we see the
parameter $\phi$ appearing only in the first-order condition relating to $x$, and once we have
determined the sign of the derivative $f_{x\phi}$ from the objective function $U = f(x, y, \phi)$,
we can immediately tell the sign of the comparative-static derivative $\partial x^*/\partial \phi$ without
further ado.
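For example, in the profit-maximization model

\[ \pi = P f(K, L) - wL - rK \]
where the first-order conditions are
\[ \pi_L = P f_L - w = 0 \qquad \pi_K = P f_K - r = 0 \]
the exogenous variable $w$ enters only the first-order condition $P f_L - w = 0$, with
\[ \frac{\partial \pi_L}{\partial w} = -1 \]
Therefore, by (13.48), we can conclude that $\partial L^*/\partial w$ will also be negative.
Further, if we combine the envelope theorem with Young's theorem, we can derive a relation
known as the reciprocity condition: $\partial L^*/\partial r = \partial K^*/\partial w$. From the indirect profit
function $\pi^*(w, r, P)$, Hotelling's lemma gives us

\[ \pi^*_w = \frac{\partial \pi^*}{\partial w} = -L^*(w, r, P) \qquad \pi^*_r = \frac{\partial \pi^*}{\partial r} = -K^*(w, r, P) \]

Differentiating again and applying Young's theorem, we have

\[ \pi^*_{wr} = -\frac{\partial L^*}{\partial r} = -\frac{\partial K^*}{\partial w} = \pi^*_{rw} \]
or
\[ \frac{\partial L^*}{\partial r} = \frac{\partial K^*}{\partial w} \tag{13.49} \]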


This result is referred to as the reciprocity condition because it shows the symmetry
between the comparative-static cross effect produced by the price of one input on the
demand for the "other" input. Specifically, in the comparative-static sense, the effect of $r$
(the rental rate for capital $K$) on the optimal demand for labor $L$ is the same as the effect of
$w$ (the wage rate for labor $L$) on the optimal demand for capital $K$.
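
The reciprocity condition (13.49) and the sign result from (13.48) can be illustrated with the same kind of numerical check. The sketch assumes the technology $f(K, L) = K^{1/4} L^{1/4}$ (an illustrative choice of ours, not taken from the text), whose input demands are available in closed form; the function names are hypothetical.

```python
# Assumed technology (illustration only): f(K, L) = K**0.25 * L**0.25

def input_demands(w, r, P):
    # Closed-form solutions of the first-order conditions for this technology
    L_star = P**2 / (16 * w**1.5 * r**0.5)
    K_star = P**2 / (16 * w**0.5 * r**1.5)
    return K_star, L_star

def cross_effects(w, r, P, h=1e-5):
    # dL*/dr and dK*/dw by central differences
    dL_dr = (input_demands(w, r + h, P)[1] - input_demands(w, r - h, P)[1]) / (2 * h)
    dK_dw = (input_demands(w + h, r, P)[0] - input_demands(w - h, r, P)[0]) / (2 * h)
    return dL_dr, dK_dw

def own_wage_effect(w, r, P, h=1e-5):
    # dL*/dw by central difference; (13.48) predicts a negative sign
    return (input_demands(w + h, r, P)[1] - input_demands(w - h, r, P)[1]) / (2 * h)

dL_dr, dK_dw = cross_effects(2.0, 1.0, 8.0)
print(dL_dr, dK_dw)                    # the two cross effects coincide
print(own_wage_effect(2.0, 1.0, 8.0))  # negative
```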


The Envelope Theorem for Constrained Optimization 


The envelope theorem can also be derived for the case of constrained optimization. Again
we will have an objective function ($U$), two choice variables ($x$ and $y$), and one parameter
($\phi$), except now we introduce the following constraint:

\[ g(x, y; \phi) = 0 \]
The problem becomes:
\[ \begin{aligned} &\text{Maximize} && U = f(x, y; \phi) \\ &\text{subject to} && g(x, y; \phi) = 0 \end{aligned} \tag{13.50} \]

The Lagrangian for this problem is

\[ Z = f(x, y; \phi) + \lambda[0 - g(x, y; \phi)] \tag{13.51} \]


with first-order conditions

\[ \begin{aligned} Z_x &= f_x - \lambda g_x = 0 \\ Z_y &= f_y - \lambda g_y = 0 \\ Z_\lambda &= -g(x, y; \phi) = 0 \end{aligned} \]
Solving this system of equations gives us

\[ x = x^*(\phi) \qquad y = y^*(\phi) \qquad \lambda = \lambda^*(\phi) \]
Substituting the solutions into the objective function, we get
\[ U^* = f(x^*(\phi), y^*(\phi), \phi) = V(\phi) \tag{13.52} \]


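where $V(\phi)$ is the indirect objective function, a maximum-value function. This is the maximum
value of $U$ for any given $\phi$, with $x$ and $y$ chosen to satisfy the constraint.
How does $V(\phi)$ change as $\phi$ changes? First, we differentiate $V$ with respect to $\phi$:

\[ \frac{dV}{d\phi} = f_x \frac{\partial x^*}{\partial \phi} + f_y \frac{\partial y^*}{\partial \phi} + f_\phi \tag{13.53} \]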


In this case, however, (13.53) will not simplify to $dV/d\phi = f_\phi$ since in constrained
optimization, it is not necessary to have $f_x = f_y = 0$ (see Table 12.1). But if we substitute the
solutions to $x$ and $y$ into the constraint (producing an identity), we get

\[ g(x^*(\phi), y^*(\phi), \phi) \equiv 0 \]
and differentiating this with respect to $\phi$ yields

\[ g_x \frac{\partial x^*}{\partial \phi} + g_y \frac{\partial y^*}{\partial \phi} + g_\phi \equiv 0 \tag{13.54} \]
If we multiply (13.54) by $\lambda$, combine the result with (13.53), and rearrange terms, we get

\[ \frac{dV}{d\phi} = (f_x - \lambda g_x) \frac{\partial x^*}{\partial \phi} + (f_y - \lambda g_y) \frac{\partial y^*}{\partial \phi} + f_\phi - \lambda g_\phi = Z_\phi \tag{13.55} \]

where $Z_\phi$ is the partial derivative of the Lagrangian function with respect to $\phi$, holding all
other variables constant. This result is in the same spirit as (13.31), and by virtue of the
first-order conditions, it reduces to

\[ \frac{dV}{d\phi} = Z_\phi \tag{13.55'} \]
which represents the envelope theorem in the framework of constrained optimization. Note,
however, that in the present case the Lagrangian function replaces the objective function in
deriving the indirect objective function.
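
A small numerical check of (13.55') may help fix the idea. The sketch below assumes the objective $f = xy$ and the constraint $g = x + \phi y - 12 = 0$; both are illustrative choices of ours and do not come from the text. For this problem the first-order conditions give $x^* = 6$, $y^* = 6/\phi$, and $\lambda^* = 6/\phi$. Because $\phi$ enters only the constraint, $f_\phi = 0$, so the objective function alone would wrongly suggest no effect.

```python
# Assumed problem (illustration only):
#   maximize f = x*y  subject to  g = x + phi*y - 12 = 0

def solve(phi):
    # From Z_x = y - lam = 0, Z_y = x - lam*phi = 0, and the constraint
    lam = 6.0 / phi
    return 6.0, 6.0 / phi, lam  # x*, y*, lambda*

def V(phi):
    # Maximum-value function V(phi) = f(x*(phi), y*(phi))
    x_star, y_star, _ = solve(phi)
    return x_star * y_star

def lagrangian(x, y, lam, phi):
    # Z = f + lam * [0 - g], as in (13.51)
    return x * y + lam * (0.0 - (x + phi * y - 12.0))

def dV_dphi(phi, h=1e-5):
    # Total derivative of V; the solution values adjust with phi
    return (V(phi + h) - V(phi - h)) / (2 * h)

def Z_phi(phi, h=1e-5):
    # Partial derivative of Z with respect to phi, holding x, y, lam at the optimum
    x_star, y_star, lam = solve(phi)
    return (lagrangian(x_star, y_star, lam, phi + h)
            - lagrangian(x_star, y_star, lam, phi - h)) / (2 * h)

print(dV_dphi(2.0), Z_phi(2.0))  # both close to -36 / phi**2
```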

While the results in (13.55) nicely parallel the unconstrained case, it is important to note 
that some of the comparative-static results depend critically on whether the parameters
enter only the objective function, or only the constraints, or enter both. If a parameter
enters only in the objective function, then the comparative-static results are the same as for
the unconstrained case. However, if the parameter enters the constraint, the relation

\[ V_{\phi\phi} > f_{\phi\phi} \]
will no longer hold.




Interpretation of the Lagrange Multiplier 

In the consumer choice problem in Chap. 12 we derived the result that the Lagrange multiplier
$\lambda$ represented the change in the value of the Lagrange function when the consumer's
budget changed. We interpreted $\lambda$ as the marginal utility of income. Now let us derive a
more general interpretation of the Lagrange multiplier with the assistance of the envelope
theorem. Consider the problem





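\[ \begin{aligned} &\text{Maximize} && U = f(x, y) \\ &\text{subject to} && g(x, y) = c \end{aligned} \]
where $c$ is a constant. The Lagrangian for this problem is
\[ Z = f(x, y) + \lambda[c - g(x, y)] \tag{13.56} \]
The first-order conditions are
\[ \begin{aligned} Z_x &= f_x(x, y) - \lambda g_x(x, y) = 0 \\ Z_y &= f_y(x, y) - \lambda g_y(x, y) = 0 \\ Z_\lambda &= c - g(x, y) = 0 \end{aligned} \tag{13.57} \]
From the first two equations in (13.57), we get
\[ \lambda = \frac{f_x}{g_x} = \frac{f_y}{g_y} \tag{13.58} \]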
which gives us the condition that the slope of the level curve (indifference curve) of the 


objective function must equal the slope of the constraint at the optimum. 
Equations (13.57) implicitly define the solutions 


\[ x^* = x^*(c) \qquad y^* = y^*(c) \qquad \lambda^* = \lambda^*(c) \tag{13.59} \]








Substituting (13.59) back into the Lagrangian yields the maximum-value function,
\[ V(c) = Z^*(c) = f(x^*(c), y^*(c)) + \lambda^*(c)[c - g(x^*(c), y^*(c))] \tag{13.60} \]
Differentiating with respect to $c$ yields

\[ \frac{dV}{dc} = \frac{dZ^*}{dc} = f_x \frac{dx^*}{dc} + f_y \frac{dy^*}{dc} + [c - g(x^*(c), y^*(c))] \frac{d\lambda^*}{dc} - \lambda^*(c) g_x \frac{dx^*}{dc} - \lambda^*(c) g_y \frac{dy^*}{dc} + \lambda^*(c) \frac{dc}{dc} \]

By rearranging we get

\[ \frac{dZ^*}{dc} = (f_x - \lambda^* g_x) \frac{dx^*}{dc} + (f_y - \lambda^* g_y) \frac{dy^*}{dc} + [c - g(x^*, y^*)] \frac{d\lambda^*}{dc} + \lambda^* \]

By (13.57), the three bracketed terms are all equal to zero. Therefore this expression
simplifies to

\[ \frac{dV}{dc} = \frac{dZ^*}{dc} = \lambda^* \tag{13.61} \]

which shows that the optimal value $\lambda^*$ measures the rate of change of the maximum
value of the objective function when $c$ changes, and is for this reason referred to as the
"shadow price" of $c$. Note that, in this case, $c$ enters the problem only through the
constraint; it is not an argument of the original objective function.
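
The shadow-price interpretation in (13.61) can be seen in a small numerical sketch. We assume the problem of maximizing $f = xy$ subject to $x + 2y = c$; this example is ours, chosen only because its solution is available by hand: $x^* = c/2$, $y^* = c/4$, and $\lambda^* = c/4$.

```python
# Assumed problem (illustration only):
#   maximize f = x*y  subject to  x + 2*y = c

def solve(c):
    # From Z_x = y - lam = 0, Z_y = x - 2*lam = 0, and c - x - 2*y = 0
    lam = c / 4.0
    return 2.0 * lam, lam, lam  # x*, y*, lambda*

def V(c):
    # Maximum-value function V(c) = f(x*(c), y*(c))
    x_star, y_star, _ = solve(c)
    return x_star * y_star

def dV_dc(c, h=1e-5):
    # Rate of change of the maximized objective as the constraint constant moves
    return (V(c + h) - V(c - h)) / (2 * h)

c0 = 10.0
print(dV_dc(c0), solve(c0)[2])  # both close to c0 / 4
```

Relaxing the constraint by a small amount raises the attainable maximum by approximately $\lambda^*$ times that amount.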


13.6 Duality and the Envelope Theorem 





A consumer's expenditure function and his or her indirect utility function exemplify the
minimum- and maximum-value functions for dual problems.* An expenditure function
specifies the minimum expenditure required to obtain a fixed level of utility given the utility
function and the prices of consumption goods. An indirect utility function specifies the
maximum utility that can be obtained given prices, income, and the utility function.


The Primal Problem 


Let $U(x, y)$ be a utility function where $x$ and $y$ are consumption goods. The consumer has
a budget $B$ and faces market prices $P_x$ and $P_y$ for goods $x$ and $y$, respectively. This problem
will be considered the primal problem:

\[ \begin{aligned} &\text{Maximize} && U = U(x, y) \\ &\text{subject to} && P_x x + P_y y = B \end{aligned} \qquad \text{[Primal]} \tag{13.62} \]

For this problem, we have the familiar Lagrangian
\[ Z = U(x, y) + \lambda(B - P_x x - P_y y) \]
The first-order conditions are
\[ \begin{aligned} Z_x &= U_x - \lambda P_x = 0 \\ Z_y &= U_y - \lambda P_y = 0 \\ Z_\lambda &= B - P_x x - P_y y = 0 \end{aligned} \tag{13.63} \]
This system of equations implicitly defines a solution for $x^m$, $y^m$, and $\lambda^m$ as a function of
the exogenous variables $B$, $P_x$, $P_y$:
\[ \begin{aligned} x^m &= x^m(P_x, P_y, B) \\ y^m &= y^m(P_x, P_y, B) \\ \lambda^m &= \lambda^m(P_x, P_y, B) \end{aligned} \]
The solutions $x^m$ and $y^m$ are the consumer's ordinary demand functions, sometimes called
the "Marshallian" demand functions, hence the superscript $m$.
Substituting the solutions $x^m$ and $y^m$ into the utility function yields
\[ U^* = U(x^m(P_x, P_y, B), y^m(P_x, P_y, B)) \equiv V(P_x, P_y, B) \tag{13.64} \]


where $V$ is the indirect utility function, a maximum-value function showing the maximum
attainable utility in problem (13.62). We shall return to this function later.


* Duality in economic theory is the relationship between two constrained optimization problems. If
one of the problems requires constrained maximization, the other problem will require constrained
minimization. The structure and solution of either problem can provide information about the
structure and solution of the other problem.




The Dual Problem 
Now consider a related dual problem for the consumer with the objective of minimizing the
expenditure on $x$ and $y$ while maintaining a fixed utility level $U^*$ derived from (13.64) of
the primal problem:
\[ \begin{aligned} &\text{Minimize} && E = P_x x + P_y y \\ &\text{subject to} && U(x, y) = U^* \end{aligned} \qquad \text{[Dual]} \tag{13.65} \]
Its Lagrangian is
\[ Z^d = P_x x + P_y y + \mu[U^* - U(x, y)] \]
and the first-order conditions are
\[ \begin{aligned} Z^d_x &= P_x - \mu U_x = 0 \\ Z^d_y &= P_y - \mu U_y = 0 \\ Z^d_\mu &= U^* - U(x, y) = 0 \end{aligned} \tag{13.66} \]
This system of equations implicitly defines a set of solution values to be labeled $x^h$, $y^h$,
and $\mu^h$:
\[ \begin{aligned} x^h &= x^h(P_x, P_y, U^*) \\ y^h &= y^h(P_x, P_y, U^*) \\ \mu^h &= \mu^h(P_x, P_y, U^*) \end{aligned} \]
Here $x^h$ and $y^h$ are the compensated ("real income" held constant) demand functions. They
are commonly referred to as "Hicksian" demand functions, hence the $h$ superscript.
Substituting $x^h$ and $y^h$ into the objective function of the dual problem yields

\[ P_x x^h(P_x, P_y, U^*) + P_y y^h(P_x, P_y, U^*) \equiv E(P_x, P_y, U^*) \tag{13.67} \]
where $E$ is the expenditure function, a minimum-value function showing the minimum
expenditure needed to attain the utility level $U^*$.


Duality 
If we take the first two equations in (13.63) and in (13.66), and eliminate the Lagrange
multipliers, we can write

\[ \frac{P_x}{P_y} = \frac{U_x}{U_y} \tag{13.68} \]
This is the tangency condition in which the consumer chooses the optimal bundle where the
slope of the indifference curve equals the slope of the budget constraint. The tangency
condition is identical for both problems. Thus, when the target level of utility in the
minimization problem is set equal to the value $U^*$ obtained from the maximization problem, we get

\[ \begin{aligned} x^m(P_x, P_y, B) &= x^h(P_x, P_y, U^*) \\ y^m(P_x, P_y, B) &= y^h(P_x, P_y, U^*) \end{aligned} \tag{13.69} \]

i.e., the solutions to both the maximization problem and the minimization problem produce
identical values for $x$ and $y$. However, the solutions are functions of different exogenous
variables, so comparative-static exercises will generally produce different results.




The fact that the solution values for $x$ and $y$ in the primal and dual problems are determined
by the tangency point of the same indifference curve and budget-constraint line
means that the minimized expenditure in the dual problem is equal to the given budget $B$ of
the primal problem:

\[ E(P_x, P_y, U^*) = B \tag{13.70} \]

This result is parallel to the result in (13.64), which reveals that the maximized value of utility
$V$ in the primal problem is equal to the given target level of utility $U^*$ in the dual problem.

While the solution values of $x$ and $y$ are identical in the two problems, the same cannot
be said about the Lagrange multipliers. From the first equation in (13.63) and in (13.66), we
can calculate $\lambda = U_x / P_x$, but $\mu = P_x / U_x$. Thus, the solution values of $\lambda$ and $\mu$ are
reciprocal to each other:

\[ \lambda = \frac{1}{\mu} \qquad \text{or} \qquad \lambda^m = \frac{1}{\mu^h} \tag{13.71} \]


Roy’s Identity 
One application of the envelope theorem is the derivation of Roy's identity. Roy's identity
states that the individual consumer's Marshallian demand function is equal to the negative of
the ratio of two partial derivatives of the maximum-value function.

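Substituting the optimal values $x^m$, $y^m$, and $\lambda^m$ into the Lagrangian of (13.62) gives us

\[ V(P_x, P_y, B) = U(x^m, y^m) + \lambda^m (B - P_x x^m - P_y y^m) \tag{13.72} \]
When we differentiate (13.72) with respect to $P_x$ we find

\[ \frac{\partial V}{\partial P_x} = (U_x - \lambda^m P_x) \frac{\partial x^m}{\partial P_x} + (U_y - \lambda^m P_y) \frac{\partial y^m}{\partial P_x} + (B - P_x x^m - P_y y^m) \frac{\partial \lambda^m}{\partial P_x} - \lambda^m x^m \]
At the optimum, the first-order conditions (13.63) enable us to simplify this to
\[ \frac{\partial V}{\partial P_x} = -\lambda^m x^m \]
Next, differentiate the value function with respect to $B$ to get
\[ \frac{\partial V}{\partial B} = (U_x - \lambda^m P_x) \frac{\partial x^m}{\partial B} + (U_y - \lambda^m P_y) \frac{\partial y^m}{\partial B} + (B - P_x x^m - P_y y^m) \frac{\partial \lambda^m}{\partial B} + \lambda^m \]
Again, at the optimum, (13.63) enables us to simplify this to
\[ \frac{\partial V}{\partial B} = \lambda^m \]
By taking the ratio of these two partial derivatives, we find that
\[ \frac{\partial V / \partial P_x}{\partial V / \partial B} = -x^m \tag{13.73} \]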


This result, known as Roy's identity, shows that the Marshallian demand for commodity $x$
is the negative of the ratio of two partial derivatives of the maximum-value function $V$ with
respect to $P_x$ and $B$, respectively. In view of the symmetry between $x$ and $y$ in the problem,
a result similar to (13.73) can also be written for $y^m$, the Marshallian demand for $y$. Of
course, this result could be arrived at directly by applying the envelope theorem.


Shephard’s Lemma 
In Sec. 13.5, we derived Hotelling's lemma, which states that the partial derivatives of
the maximum value of the profit function yield the firm's input-demand functions and the
supply functions. A similar approach applied to the expenditure function yields Shephard's
lemma.

Consider the consumer's minimization problem (13.65). The Lagrangian is

\[ Z^d = P_x x + P_y y + \mu[U^* - U(x, y)] \]

From the first-order conditions, the following solutions are implicitly defined:

\[ \begin{aligned} x^h &= x^h(P_x, P_y, U^*) \\ y^h &= y^h(P_x, P_y, U^*) \\ \mu^h &= \mu^h(P_x, P_y, U^*) \end{aligned} \]
Substituting these solutions into the Lagrangian yields the expenditure function:
\[ E(P_x, P_y, U^*) = P_x x^h + P_y y^h + \mu^h[U^* - U(x^h, y^h)] \]


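Taking the partial derivatives of this function with respect to $P_x$ and $P_y$ and evaluating them
at the optimum, we find that $\partial E/\partial P_x$ and $\partial E/\partial P_y$ represent the consumer's Hicksian
demands:

\[ \begin{aligned} \frac{\partial E}{\partial P_x} &= (P_x - \mu^h U_x) \frac{\partial x^h}{\partial P_x} + (P_y - \mu^h U_y) \frac{\partial y^h}{\partial P_x} + [U^* - U(x^h, y^h)] \frac{\partial \mu^h}{\partial P_x} + x^h \\ &= (0) \frac{\partial x^h}{\partial P_x} + (0) \frac{\partial y^h}{\partial P_x} + (0) \frac{\partial \mu^h}{\partial P_x} + x^h = x^h \end{aligned} \tag{13.74} \]
and
\[ \begin{aligned} \frac{\partial E}{\partial P_y} &= (P_x - \mu^h U_x) \frac{\partial x^h}{\partial P_y} + (P_y - \mu^h U_y) \frac{\partial y^h}{\partial P_y} + [U^* - U(x^h, y^h)] \frac{\partial \mu^h}{\partial P_y} + y^h \\ &= (0) \frac{\partial x^h}{\partial P_y} + (0) \frac{\partial y^h}{\partial P_y} + (0) \frac{\partial \mu^h}{\partial P_y} + y^h = y^h \end{aligned} \tag{13.74'} \]

Finally, differentiating $E$ with respect to the constraint $U^*$ yields $\mu^h$, the marginal cost of
the constraint:

\[ \begin{aligned} \frac{\partial E}{\partial U^*} &= (P_x - \mu^h U_x) \frac{\partial x^h}{\partial U^*} + (P_y - \mu^h U_y) \frac{\partial y^h}{\partial U^*} + [U^* - U(x^h, y^h)] \frac{\partial \mu^h}{\partial U^*} + \mu^h \\ &= (0) \frac{\partial x^h}{\partial U^*} + (0) \frac{\partial y^h}{\partial U^*} + (0) \frac{\partial \mu^h}{\partial U^*} + \mu^h = \mu^h \end{aligned} \tag{13.74''} \]

Together, the three partial derivatives (13.74), (13.74'), and (13.74'') are referred to as
Shephard's lemma.


Example 1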


Consider a consumer with the utility function $U = xy$, who faces a budget constraint of $B$
and is given prices $P_x$ and $P_y$.
The choice problem is

\[ \begin{aligned} &\text{Maximize} && U = xy \\ &\text{subject to} && P_x x + P_y y = B \end{aligned} \]
The Lagrangian for this problem is
\[ Z = xy + \lambda(B - P_x x - P_y y) \]
The first-order conditions are
\[ \begin{aligned} Z_x &= y - \lambda P_x = 0 \\ Z_y &= x - \lambda P_y = 0 \\ Z_\lambda &= B - P_x x - P_y y = 0 \end{aligned} \]
Solving the first-order conditions yields the following solutions:

\[ x^m = \frac{B}{2P_x} \qquad y^m = \frac{B}{2P_y} \qquad \lambda^m = \frac{B}{2P_x P_y} \]
where $x^m$ and $y^m$ are the consumer's Marshallian demand functions. For the second-order
condition, since the bordered Hessian is

\[ |\bar{H}| = \begin{vmatrix} 0 & 1 & -P_x \\ 1 & 0 & -P_y \\ -P_x & -P_y & 0 \end{vmatrix} = 2P_x P_y > 0 \]

the solution does represent a maximum.*
We can now derive the indirect utility function for this problem by substituting $x^m$ and
$y^m$ into the utility function:

\[ V(P_x, P_y, B) = \left(\frac{B}{2P_x}\right)\left(\frac{B}{2P_y}\right) = \frac{B^2}{4P_x P_y} \tag{13.75} \]

where $V$ denotes the maximized utility. Since $V$ represents the maximized utility, we can set
$V = U^*$ in (13.75) to get $B^2 / 4P_x P_y = U^*$, and then rearrange terms to express $B$ as

\[ B = (4P_x P_y U^*)^{1/2} = 2P_x^{1/2} P_y^{1/2} U^{*1/2} \]

Now, think of the consumer's dual problem of expenditure minimization. In the dual
problem, the minimum-expenditure function $E$ should be equal to the given budget
amount $B$ of the primal problem. Therefore, we can immediately conclude from the
preceding equation that

\[ E(P_x, P_y, U^*) = 2P_x^{1/2} P_y^{1/2} U^{*1/2} \tag{13.76} \]
* Note that the bordered Hessian is written here (and in Example 2 on page 440) with the borders in
the third row and column, instead of in the first row and column as in (12.19). This is the result of
listing the Lagrange multiplier as the last rather than the first variable as we did in previous chapters.
Exercise 12.3-3 shows that the two alternative expressions for the bordered Hessian are transformable
into each other by elementary row operations without affecting its value. However, when more than
two choice variables appear in a problem, it is preferable to use the (12.19) format because that
makes it easier to write out the bordered leading principal minors.


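Let's now use this example to verify Roy's identity (13.73):

\[ x^m = -\frac{\partial V / \partial P_x}{\partial V / \partial B} \]
Taking the relevant partial derivatives of $V$, we find
\[ \frac{\partial V}{\partial P_x} = -\frac{B^2}{4P_x^2 P_y} \qquad \text{and} \qquad \frac{\partial V}{\partial B} = \frac{B}{2P_x P_y} \]

The negative of the ratio of these two partials is

\[ -\frac{\partial V / \partial P_x}{\partial V / \partial B} = \frac{B^2 / (4P_x^2 P_y)}{B / (2P_x P_y)} = \frac{B}{2P_x} = x^m \]

Thus we find that Roy's identity does hold.


Example 2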





Now consider the dual problem of cost minimization given a fixed level of utility related to 
Example 1. Letting U* denote the target level of utility, the problem is: 


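\[ \begin{aligned} &\text{Minimize} && P_x x + P_y y \\ &\text{subject to} && xy = U^* \end{aligned} \]
The Lagrangian for the problem is
\[ Z^d = P_x x + P_y y + \mu(U^* - xy) \]

The first-order conditions are

\[ \begin{aligned} Z^d_x &= P_x - \mu y = 0 \\ Z^d_y &= P_y - \mu x = 0 \\ Z^d_\mu &= U^* - xy = 0 \end{aligned} \]

Solving the system of equations for $x$, $y$, and $\mu$, we get
\[ x^h = \left(\frac{P_y U^*}{P_x}\right)^{1/2} \qquad y^h = \left(\frac{P_x U^*}{P_y}\right)^{1/2} \qquad \mu^h = \left(\frac{P_x P_y}{U^*}\right)^{1/2} \tag{13.77} \]
where $x^h$ and $y^h$ are the consumer's compensated (Hicksian) demand functions. Checking
the second-order condition for a minimum, we find
\[ |\bar{H}| = \begin{vmatrix} 0 & -\mu & -y \\ -\mu & 0 & -x \\ -y & -x & 0 \end{vmatrix} = -2xy\mu < 0 \]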





Thus the sufficient condition for a minimum is satisfied. 


Substituting $x^h$ and $y^h$ into the original objective function gives us the minimum-value
function, or expenditure function

\[ \begin{aligned} E = P_x x^h + P_y y^h &= P_x \left(\frac{P_y U^*}{P_x}\right)^{1/2} + P_y \left(\frac{P_x U^*}{P_y}\right)^{1/2} \\ &= (P_x P_y U^*)^{1/2} + (P_x P_y U^*)^{1/2} \\ &= 2P_x^{1/2} P_y^{1/2} U^{*1/2} \end{aligned} \tag{13.76'} \]

Note that this result is identical with (13.76) in Example 1. The only difference lies in the
process used to derive the result. Equation (13.76') is obtained directly from an
expenditure-minimization problem, whereas (13.76) is indirectly deduced, via the duality relationship,
from a utility-maximization problem.

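We shall now use this example to test the validity of Shephard's lemma (13.74), (13.74'),
and (13.74''). Differentiating the expenditure function in (13.76) with respect to $P_x$, $P_y$,
and $U^*$, respectively, and relating the resulting partial derivatives to (13.77), we find

\[ \begin{aligned} \frac{\partial E(P_x, P_y, U^*)}{\partial P_x} &= \frac{P_y^{1/2} U^{*1/2}}{P_x^{1/2}} = x^h \\ \frac{\partial E(P_x, P_y, U^*)}{\partial P_y} &= \frac{P_x^{1/2} U^{*1/2}}{P_y^{1/2}} = y^h \\ \frac{\partial E(P_x, P_y, U^*)}{\partial U^*} &= \frac{P_x^{1/2} P_y^{1/2}}{U^{*1/2}} = \mu^h \end{aligned} \]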


Thus, Shephard’s Lemma holds in this example. 
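
The results of Examples 1 and 2 can also be confirmed numerically. The sketch below codes the indirect utility function (13.75) and the expenditure function (13.76) for $U = xy$ and checks Roy's identity, Shephard's lemma, and the duality relations (13.70) and (13.71) by central differences. The particular prices and budget, the helper partial, and the step size are illustrative assumptions.

```python
import math

def indirect_utility(Px, Py, B):
    # V = B**2 / (4 Px Py), from (13.75)
    return B**2 / (4 * Px * Py)

def expenditure(Px, Py, U):
    # E = 2 (Px Py U)**(1/2), from (13.76)
    return 2 * math.sqrt(Px * Py * U)

def partial(fn, args, i, h=1e-5):
    # Central-difference partial derivative of fn with respect to argument i
    up, dn = list(args), list(args)
    up[i] += h
    dn[i] -= h
    return (fn(*up) - fn(*dn)) / (2 * h)

Px, Py, B = 2.0, 4.0, 16.0
x_m = B / (2 * Px)            # Marshallian demand for x
lam_m = B / (2 * Px * Py)     # multiplier of the primal problem
U_star = indirect_utility(Px, Py, B)

# Roy's identity (13.73): x^m = -(dV/dPx) / (dV/dB)
roy = -partial(indirect_utility, (Px, Py, B), 0) / partial(indirect_utility, (Px, Py, B), 2)

# Shephard's lemma (13.74) and (13.74''): dE/dPx = x^h and dE/dU* = mu^h
x_h = partial(expenditure, (Px, Py, U_star), 0)
mu_h = partial(expenditure, (Px, Py, U_star), 2)

print(roy, x_m)                          # Roy's identity
print(x_h, x_m)                          # x^h equals x^m when U* = V, as in (13.69)
print(expenditure(Px, Py, U_star), B)    # E = B, as in (13.70)
print(mu_h * lam_m)                      # close to 1, as in (13.71)
```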





EXERCISE 13.6 


1. A consumer has the following utility function: $U(x, y) = x(y + 1)$, where $x$ and $y$ are
quantities of two consumption goods whose prices are $P_x$ and $P_y$, respectively. The
consumer also has a budget of $B$. Therefore, the consumer's Lagrangian is

\[ x(y + 1) + \lambda(B - P_x x - P_y y) \]

(a) From the first-order conditions find expressions for the demand functions. What
kind of good is $y$? In particular, what happens when $P_y > B$?

(b) Verify that this is a maximum by checking the second-order conditions. By substituting
$x^*$ and $y^*$ into the utility function, find an expression for the indirect utility
function

\[ U^* = U(P_x, P_y, B) \]
and derive an expression for the expenditure function
\[ E = E(P_x, P_y, U^*) \]
(c) This problem could be recast as the following dual problem
\[ \begin{aligned} &\text{Minimize} && P_x x + P_y y \\ &\text{subject to} && x(y + 1) = U^* \end{aligned} \]

Find the values of $x$ and $y$ that solve this minimization problem and show that the
values of $x$ and $y$ are equal to the partial derivatives of the expenditure function,
$\partial E/\partial P_x$ and $\partial E/\partial P_y$, respectively.




13.7 Some Concluding Remarks 





In the present part of the book, we have covered the basic techniques of optimization. The 
somewhat arduous journey has taken us (1) from the case of a single choice variable to the
more general n-variable case, (2) from the polynomial objective function to the exponential
and logarithmic, and (3) from the unconstrained to the constrained variety of extremum. 

Most of this discussion consists of the “classical” methods of optimization, with differ- 
ential calculus as the mainstay, and derivatives of various orders as the primary tools. One 
weakness of the calculus approach to optimization is its essentially myopic nature. While 
the first- and second-order conditions in terms of derivatives or differentials can normally 
locate relative or local extrema without difficulty, additional information or further investi- 
gation is often required for identification of absolute or global extrema. Our detailed dis- 
cussion of concavity, convexity, quasiconcavity, and quasiconvexity is intended as a useful 
stepping-stone from the realm of relative extrema to that of absolute ones.

A more serious limitation of the calculus approach is its inability to cope with con- 
straints in the inequality form. For this reason, the budget constraint in the utility- 
maximization model, for instance, is stated in the form that the total expenditure be exactly 
equal to (and not "less than or equal to") a specified sum. In other words, the limitation of
the calculus approach makes it necessary to deny the consumer the option of saving part of
the available funds. And, for the same reason, the approach does not allow us to
specify explicitly that the choice variables must be nonnegative as is appropriate in most 
economic analysis. 

Fortunately, we are liberated from these limitations when we introduce the modern
optimization technique known as nonlinear programming. Here we can openly admit in- 
equality constraints, including nonnegativity restrictions on the choice variables, into the 
problem. This obviously represents a giant step forward in the development of optimization 
methodology. 

Still, even in nonlinear programming, the analytical framework remains static. The 
problem and its solution relate only to the optimal state at one point of time and cannot address
the question of how an optimizing agent should, under given circumstances, behave
over a period of time. The latter question pertains to the realm of dynamic optimization, 
which we are unable to handle until we have learned the basics of dynamic analysis, that is, the
analysis of movements of variables over time. In fact, aside from its application to dynamic
optimization, dynamic analysis is, in itself, an important branch of economic analysis. For
this reason, we shall now turn our attention to the subject of dynamic analysis in Part 5. 





Part 5 
Dynamic Analysis 








Chapter 14

Economic Dynamics
and Integral Calculus


The term dynamics, as applied to economic analysis, has had different meanings at different
times and for different economists.* In standard usage today, however, the term refers to
the type of analysis in which the object is either to trace and study the specific time paths
of the variables or to determine whether, given sufficient time, these variables will tend to
converge to certain (equilibrium) values. This type of information is important because it
fills a major gap that marred our study of statics and comparative statics. In the latter, we
always make the arbitrary assumption that the process of economic adjustment inevitably
leads to an equilibrium. In a dynamic analysis, the question of "attainability" is to be
squarely faced, rather than assumed away.

One salient feature of dynamic analysis is the dating of the variables, which introduces 
the explicit consideration of time into the picture. This can be done in two ways: time can 
be considered either as a continuous variable or as a discrete variable. In the former case,
something is happening to the variable at each point of time (such as in continuous interest
compounding); whereas in the latter, the variable undergoes a change only once within a
period of time (e.g., interest is added only at the end of every 6 months). One of these time
concepts may be more appropriate than the other in certain contexts. 

We shall discuss first the continuous-time case, to which the mathematical techniques of 
integral calculus and differential equations are pertinent. Later, in Chaps. 17 and 18, we
shall turn to the discrete-time case, which utilizes the methods of difference equations. 


14.1 Dynamics and Integration 





In a static model, generally speaking, the problem is to find the values of the endogenous 
variables that satisfy some specified equilibrium condition(s). Applied to the context of 
optimization models, the task becomes one of finding the values of the choice variables 
that maximize (or minimize) a specific objective function with the first-order condition
serving as the equilibrium condition. In a dynamic model, by contrast, the problem
involves instead the delineation of the time path of some variable, on the basis of a known
pattern of change (say, a given instantaneous rate of change).


* Fritz Machlup, "Statics and Dynamics: Kaleidoscopic Words," Southern Economic Journal, October
1959, pp. 91-110; reprinted in Machlup, Essays on Economic Semantics, Prentice-Hall, Inc.,
Englewood Cliffs, N.J., 1963, pp. 9-42.

An example should make this clear. Suppose that population size $H$ is known to change
over time at the rate

\[ \frac{dH}{dt} = t^{-1/2} \tag{14.1} \]

We then try to find what time path(s) of population $H = H(t)$ can yield the rate of change
in (14.1).

You will recognize that, if we know the function $H = H(t)$ to begin with, the derivative
$dH/dt$ can be found by differentiation. But in the problem now confronting us, the shoe is
on the other foot: we are called upon to uncover the primitive function from a given derived
function, rather than the reverse. Mathematically, we now need the exact opposite of the
method of differentiation, or of differential calculus.

The relevant method, known as integration, or integral calculus, will be studied in this
chapter. For the time being, let us be content with the observation that the function
$H(t) = 2t^{1/2}$ does indeed have a derivative of the form in (14.1), thus apparently qualifying
as a solution to our problem. The trouble is that there also exist similar functions, such
as $H(t) = 2t^{1/2} + 15$ or $H(t) = 2t^{1/2} + 99$ or, more generally,

\[ H(t) = 2t^{1/2} + c \qquad (c = \text{an arbitrary constant}) \tag{14.2} \]

which all possess exactly the same derivative (14.1). No unique time path can be determined,
therefore, unless the value of the constant $c$ can somehow be made definite. To
accomplish this, additional information must be introduced into the model, usually in the
form of what is known as an initial condition or boundary condition.

If we have knowledge of the initial population H(0)—that is, the value of Hat = 0, let 
us say, 7(0) = 100—then the value of the constant ¢ can be made determinate. Setting 
t= 0 in (14,2), we get 


HQ) = 200)? te =e 
But if H(0) = 100, then ¢ = 100, and (14.2) becomes 
H(t) = 21! + 100 (14,2) 


where the constant is no longer arbitrary. More generally, for any given initial population 
#1(0), the time path will be 


A(t) = 20? 4 110) (14.2) 


Thus the population size $H$ at any point of time will, in the present example, consist of the
sum of the initial population $H(0)$ and another term involving the time variable $t$. Such a
time path indeed charts the complete itinerary of the variable $H$ over time, and thus it truly
constitutes the solution to our dynamic model. [Equation (14.1) is also a function of $t$. Why
can't it be considered a solution as well?]

Simple as it is, this population example illustrates the quintessence of the problems of 
economic dynamics. Given the pattern of behavior of a variable over time, we seek to find 
a function that describes the time path of the variable. In the process, we shall encounter 
one or more arbitrary constants, but if we possess sufficient additional information in the 
form of initial conditions, it will be possible to definitize these arbitrary constants. 
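The population example can also be checked numerically. The following Python sketch is only an illustration: the helper names `time_path` and `numerical_derivative`, the step size, and the evaluation point $t = 4$ are assumptions, not part of the text. It confirms that every path of the form $H(t) = 2t^{1/2} + H(0)$ has the same rate of change $t^{-1/2}$, while the initial condition alone pins down the constant.

```python
import math

def time_path(t, h0):
    """Time path H(t) = 2*t**0.5 + H(0) for a given initial population h0."""
    return 2.0 * math.sqrt(t) + h0

def numerical_derivative(func, t, step=1e-6):
    """Central-difference approximation of d(func)/dt at t."""
    return (func(t + step) - func(t - step)) / (2.0 * step)

# Different initial populations shift the path but leave its slope unchanged.
for h0 in (0.0, 15.0, 99.0, 100.0):
    slope = numerical_derivative(lambda t: time_path(t, h0), 4.0)
    print(h0, round(slope, 6), time_path(0.0, h0))
```

Each printed slope should agree with $t^{-1/2}$ at $t = 4$, whereas the last column reproduces whichever initial population was supplied.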


Part Five Dynamic Analysis


In the simpler types of problem, such as the one just cited, the solution can be found by
the method of integral calculus, which deals with the process of tracing a given derivative
function back to its primitive function. In more complicated cases, we can also resort to the
known techniques of the closely related branch of mathematics known as differential equations.
Since a differential equation is defined as any equation containing differential or
derivative expressions, (14.1) surely qualifies as one; consequently, by finding its solution,
we have in fact already solved a differential equation, albeit an exceedingly simple one.

Let us now proceed to the study of the basic concepts of integral calculus. Since we discussed
differential calculus with $x$ (rather than $t$) as the independent variable, for the sake
of symmetry we shall use $x$ here, too. For convenience, however, we shall in the present discussion
denote the primitive and derived functions by $F(x)$ and $f(x)$, respectively, rather
than distinguish them by the use of a prime.


14.2 Indefinite Integrals





The Nature of Integrals 
It has been mentioned that integration is the reverse of differentiation. If differentiation of
a given primitive function $F(x)$ yields the derivative $f(x)$, we can "integrate" $f(x)$ to find
$F(x)$, provided appropriate information is available to definitize the arbitrary constant
that will arise in the process of integration. The function $F(x)$ is referred to as an integral
(or antiderivative) of the function $f(x)$. These two types of process may thus be likened to
two ways of studying a family tree: integration involves the tracing of the parentage of the
function $f(x)$, whereas differentiation seeks out the progeny of the function $F(x)$. But note
this difference: while the (differentiable) primitive function $F(x)$ invariably produces a
lone offspring, namely, a unique derivative $f(x)$, the derived function $f(x)$ is traceable to
an infinite number of possible parents through integration, because if $F(x)$ is an integral of
$f(x)$, then so also must be $F(x)$ plus any constant, as we saw in (14.2).

We need a special notation to denote the required integration of $f(x)$ with respect to $x$.
The standard one is

\[ \int f(x)\,dx \]

The symbol on the left, an elongated S (with the connotation of sum, to be explained
later), is called the integral sign, whereas the $f(x)$ part is known as the integrand (the
function to be integrated), and the $dx$ part (similar to the $dx$ in the differentiation operator
$d/dx$) reminds us that the operation is to be performed with respect to the variable $x$.
However, you may also take $f(x)\,dx$ as a single entity and interpret it as the differential
of the primitive function $F(x)$ [that is, $dF(x) = f(x)\,dx$]. Then, the integral sign in front
can be viewed as an instruction to reverse the differentiation process that gave rise to the
differential. With this new notation, we can write that

\[ \frac{d}{dx}F(x) = f(x) \quad\Rightarrow\quad \int f(x)\,dx = F(x) + c \tag{14.3} \]

where the presence of $c$, an arbitrary constant of integration, serves to indicate the multiple
parentage of the integrand.




The integral $\int f(x)\,dx$ is, more specifically, known as the indefinite integral of $f(x)$ (as
against the definite integral to be discussed in Sec. 14.3), because it has no definite numerical
value. Because it is equal to $F(x) + c$, its value will in general vary with the value of
$x$ (even if $c$ is definitized). Thus, like a derivative, an indefinite integral is itself a function
of the variable $x$.


Basic Rules of Integration 

Just as there are rules of derivation, we can also develop certain rules of integration. As may
be expected, the latter are heavily dependent on the rules of derivation with which we are
already familiar. From the following derivative formula for a power function,

\[ \frac{d}{dx}\left(\frac{x^{n+1}}{n+1}\right) = x^n \qquad (n \neq -1) \]

for instance, we see that the expression $x^{n+1}/(n+1)$ is the primitive function for the
derivative function $x^n$; thus, by substituting these for $F(x)$ and $f(x)$ in (14.3), we may
state the result as a rule of integration.

Rule I (the power rule)

\[ \int x^n\,dx = \frac{1}{n+1}x^{n+1} + c \qquad (n \neq -1) \]





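Example 1. Find $\int x^3\,dx$. Here, we have $n = 3$, and therefore

\[ \int x^3\,dx = \frac{1}{4}x^4 + c \]

Example 2. Find $\int x\,dx$. Since $n = 1$, we have

\[ \int x\,dx = \frac{1}{2}x^2 + c \]

Example 3. What is $\int 1\,dx$? To find this integral, we recall that $x^0 = 1$; so we can let $n = 0$ in the power
rule and get

\[ \int 1\,dx = x + c \]

[$\int 1\,dx$ is sometimes written simply as $\int dx$, since $1\,dx = dx$.]

Example 4. Find $\int \sqrt{x^3}\,dx$. Since $\sqrt{x^3} = x^{3/2}$, we have $n = \frac{3}{2}$; therefore,

\[ \int \sqrt{x^3}\,dx = \frac{x^{5/2}}{5/2} + c = \frac{2}{5}\sqrt{x^5} + c \]

Example 5. Find $\int \frac{1}{x^4}\,dx$ $(x \neq 0)$. Since $1/x^4 = x^{-4}$, we have $n = -4$. Thus the integral is

\[ \int \frac{1}{x^4}\,dx = \frac{x^{-3}}{-3} + c = -\frac{1}{3x^3} + c \]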


Note that the correctness of the results of integration can always be checked by differentiation;
if the integration process is correct, the derivative of the integral must be equal to
the integrand.
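The check-by-differentiation idea can be mechanized for single power terms. The sketch below is a minimal illustration, assuming a term $kx^n$ is stored as a pair `(coefficient, exponent)`; the function names `power_rule` and `differentiate` are hypothetical helpers, not notation from the text. Exact fractions are used so that no rounding blurs the comparison.

```python
from fractions import Fraction

def power_rule(n):
    """Return (coefficient, exponent) of the antiderivative of x**n, n != -1."""
    n = Fraction(n)
    if n == -1:
        raise ValueError("the power rule does not apply when n = -1")
    return 1 / (n + 1), n + 1

def differentiate(coefficient, exponent):
    """Differentiate coefficient * x**exponent, returning (coefficient, exponent)."""
    return coefficient * exponent, exponent - 1

# Integrands x^3, x, 1, x^(3/2) and x^(-4), as in the power-rule examples.
for n in (3, 1, 0, Fraction(3, 2), -4):
    coefficient, exponent = power_rule(n)
    # Differentiating the integral must give back the integrand 1 * x**n.
    assert differentiate(coefficient, exponent) == (1, Fraction(n))
    print(f"n = {n}: ({coefficient}) * x^({exponent}) + c")
```

The arbitrary constant $c$ is left implicit, since it vanishes on differentiation.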




The derivative formulas for simple exponential and logarithmic functions have been
shown to be

\[ \frac{d}{dx}e^x = e^x \qquad \text{and} \qquad \frac{d}{dx}\ln x = \frac{1}{x} \quad (x > 0) \]

From these, two other basic rules of integration emerge.

Rule II (the exponential rule)

\[ \int e^x\,dx = e^x + c \]

Rule III (the logarithmic rule)

\[ \int \frac{1}{x}\,dx = \ln x + c \qquad (x > 0) \]

It is of interest that the integrand involved in Rule III is $1/x = x^{-1}$, which is a special
form of the power function $x^n$ with $n = -1$. This particular integrand is inadmissible under
the power rule, but now is duly taken care of by the logarithmic rule.

As stated, the logarithmic rule is placed under the restriction $x > 0$, because logarithms
do not exist for nonpositive values of $x$. A more general formulation of the rule, which can
take care of negative values of $x$, is

\[ \int \frac{1}{x}\,dx = \ln |x| + c \qquad (x \neq 0) \]

which also implies that $(d/dx)\ln |x| = 1/x$, just as $(d/dx)\ln x = 1/x$. You should convince
yourself that the replacement of $x$ (with the restriction $x > 0$) by $|x|$ (with the
restriction $x \neq 0$) does not vitiate the formula in any way.

Also, as a matter of notation, it should be pointed out that the integral $\int \frac{1}{x}\,dx$ is
sometimes also written as $\int \frac{dx}{x}$.


As variants of Rules II and III, we also have the following two rules.

Rule IIa

\[ \int f'(x)e^{f(x)}\,dx = e^{f(x)} + c \]

Rule IIIa

\[ \int \frac{f'(x)}{f(x)}\,dx = \ln f(x) + c \quad [f(x) > 0] \qquad \text{or} \qquad \ln |f(x)| + c \quad [f(x) \neq 0] \]

The bases for these two rules can be found in the derivative rules in (10.20).
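One way to convince oneself of the $\ln |x|$ formulation and of the two variant rules is to differentiate the claimed integrals numerically. The Python sketch below is illustrative only: the inner functions $f(x) = x^2$ and $f(x) = x^2 + 1$, the sample points, and all helper names are assumptions chosen for the demonstration.

```python
import math

def numerical_derivative(func, x, step=1e-6):
    """Central-difference approximation of d(func)/dx at x."""
    return (func(x + step) - func(x - step)) / (2.0 * step)

def log_abs(x):
    """Candidate integral of 1/x, valid for any x != 0."""
    return math.log(abs(x))

def integral_iia(x):
    """Candidate integral of f'(x) * exp(f(x)) with the assumed f(x) = x**2."""
    return math.exp(x ** 2)

def integral_iiia(x):
    """Candidate integral of f'(x) / f(x) with the assumed f(x) = x**2 + 1."""
    return math.log(x ** 2 + 1)

# Negative sample points are included on purpose to exercise ln|x|.
for x in (-2.0, -0.5, 0.5, 2.0):
    print(x,
          numerical_derivative(log_abs, x), 1 / x,
          numerical_derivative(integral_iia, x), 2 * x * math.exp(x ** 2),
          numerical_derivative(integral_iiia, x), 2 * x / (x ** 2 + 1))
```

In each row the numerical derivative should sit next to a matching integrand value, up to the small error of the finite-difference approximation.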


Rules of Operation 
The three preceding rules amply illustrate the spirit underlying all rules of integration. Each
rule always corresponds to a certain derivative formula. Also, an arbitrary constant is
always appended at the end (even though it is to be definitized later by using a given boundary
condition) to indicate that a whole family of primitive functions can give rise to the
given integrand.

To be able to deal with more complicated integrands, however, we shall also find the
following two rules of operation with regard to integrals helpful.


Rule IV (the integral of a sum) The integral of the sum of a finite number of functions
is the sum of the integrals of those functions. For the two-function case, this means that

\[ \int [f(x) + g(x)]\,dx = \int f(x)\,dx + \int g(x)\,dx \]

This rule is a natural consequence of the fact that

\[ \underbrace{\frac{d}{dx}[F(x) + G(x)]}_{A} = \underbrace{\frac{d}{dx}F(x) + \frac{d}{dx}G(x)}_{B} = \underbrace{f(x) + g(x)}_{C} \]

Inasmuch as $A = C$, on the basis of (14.3) we can write

\[ \int [f(x) + g(x)]\,dx = F(x) + G(x) + c \tag{14.4} \]

But, from the fact that $B = C$, it follows that

\[ \int f(x)\,dx = F(x) + c_1 \qquad \text{and} \qquad \int g(x)\,dx = G(x) + c_2 \]

Thus we can obtain (by addition)

\[ \int f(x)\,dx + \int g(x)\,dx = F(x) + G(x) + c_1 + c_2 \tag{14.5} \]

Since the constants $c$, $c_1$, and $c_2$ are arbitrary in value, we can let $c = c_1 + c_2$. Then the
right sides of (14.4) and (14.5) become equal, and as a consequence, their left sides must
be equal also. This proves Rule IV.


Example 6. Find $\int (x^3 + x + 1)\,dx$. By Rule IV, this integral can be expressed as a sum of three integrals:
$\int x^3\,dx + \int x\,dx + \int 1\,dx$. Since the values of these three integrals have previously been
found in Examples 1, 2, and 3, we can simply combine those results to get

\[ \int (x^3 + x + 1)\,dx = \left(\frac{x^4}{4} + c_1\right) + \left(\frac{x^2}{2} + c_2\right) + (x + c_3) = \frac{x^4}{4} + \frac{x^2}{2} + x + c \]

In the final answer, we have lumped together the three subscripted constants into a single
constant $c$.


As a general practice, all the additive arbitrary constants of integration that emerge dur- 
ing the process can always be combined into a single arbitrary constant in the final answer. 





Example 7. Find $\int \left(2e^{2x} + \frac{14x}{7x^2 + 5}\right) dx$. By Rule IV, we can integrate the two additive terms in the
integrand separately, and then sum the results. Since the $2e^{2x}$ term is in the format of
$f'(x)e^{f(x)}$ in Rule IIa, with $f(x) = 2x$, the integral is $e^{2x} + c_1$. Similarly, the other term,
$14x/(7x^2 + 5)$, takes the form of $f'(x)/f(x)$, with $f(x) = 7x^2 + 5 > 0$. Thus, by Rule IIIa, the
integral is $\ln(7x^2 + 5) + c_2$. Hence we can write

\[ \int \left(2e^{2x} + \frac{14x}{7x^2 + 5}\right) dx = e^{2x} + \ln(7x^2 + 5) + c \]

where we have combined $c_1$ and $c_2$ into one arbitrary constant $c$.


Rule V (the integral of a multiple) The integral of $k$ times an integrand ($k$ being a constant)
is $k$ times the integral of that integrand. In symbols,

\[ \int k f(x)\,dx = k \int f(x)\,dx \]

What this rule amounts to, operationally, is that a multiplicative constant can be "factored
out" of the integral sign. (Warning: A variable term cannot be factored out in this fashion!)
To prove this rule (for the case where $k$ is an integer), we recall that $k$ times $f(x)$ merely
means adding $f(x)$ $k$ times; therefore, by Rule IV,

\[ \int k f(x)\,dx = \int \underbrace{[f(x) + f(x) + \cdots + f(x)]}_{k \text{ terms}}\,dx = \underbrace{\int f(x)\,dx + \int f(x)\,dx + \cdots + \int f(x)\,dx}_{k \text{ terms}} = k \int f(x)\,dx \]


Example 8. Find $\int -f(x)\,dx$. Here $k = -1$, and thus

\[ \int -f(x)\,dx = -\int f(x)\,dx \]

That is, the integral of the negative of a function is the negative of the integral of that
function.

Example 9. Find $\int 2x^2\,dx$. Factoring out the 2 and applying Rule I, we have

\[ \int 2x^2\,dx = 2\int x^2\,dx = 2\left(\frac{x^3}{3} + c_1\right) = \frac{2}{3}x^3 + c \]

Example 10. Find $\int 3x^2\,dx$. In this case, factoring out the multiplicative constant yields

\[ \int 3x^2\,dx = 3\int x^2\,dx = 3\left(\frac{x^3}{3} + c_1\right) = x^3 + c \]

Note that, in contrast to the preceding example, the term $x^3$ in the final answer does not
have any fractional expression attached to it. This neat result is due to the fact that 3 (the
multiplicative constant of the integrand) happens to be precisely equal to 2 (the power of
the function) plus 1. Referring to the power rule (Rule I), we see that the multiplicative constant
$(n + 1)$ will in such a case cancel out the fraction $1/(n + 1)$, thereby yielding $(x^{n+1} + c)$
as the answer.

In general, whenever we have an expression $(n + 1)x^n$ as the integrand, there is really
no need to factor out the constant $(n + 1)$ and then integrate $x^n$; instead, we may write
$x^{n+1} + c$ as the answer right away.


Example 11. Find $\int \left(5e^x - x^{-2} + \frac{3}{x}\right) dx$ $(x \neq 0)$. This example illustrates both Rules IV and V; actually,
it illustrates the first three rules as well:

\[ \begin{aligned}
\int \left(5e^x - x^{-2} + \frac{3}{x}\right) dx &= 5\int e^x\,dx - \int x^{-2}\,dx + 3\int \frac{1}{x}\,dx \qquad \text{[by Rules IV and V]} \\
&= (5e^x + c_1) - \left(\frac{x^{-1}}{-1} + c_2\right) + (3\ln |x| + c_3) \\
&= 5e^x + \frac{1}{x} + 3\ln |x| + c
\end{aligned} \]

The correctness of the result can again be verified by differentiation.
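Taken together, Rules I, IV, and V amount to a term-by-term procedure for any sum of power terms. The following Python sketch illustrates that procedure under stated assumptions: an integrand is represented as a list of `(k, n)` pairs standing for $kx^n$, and the function name `integrate_terms` is hypothetical. Terms with $n = -1$ are rejected because they fall under the logarithmic rule instead.

```python
from fractions import Fraction

def integrate_terms(terms):
    """Integrate a sum of k * x**n terms (n != -1) term by term.

    Rule IV splits the sum, Rule V factors out each k, and Rule I
    integrates x**n. The arbitrary constants are merged into one c,
    which is left implicit in the returned list.
    """
    result = []
    for k, n in terms:
        k, n = Fraction(k), Fraction(n)
        if n == -1:
            raise ValueError("x**(-1) needs the logarithmic rule instead")
        result.append((k / (n + 1), n + 1))
    return result

# The integrand x^3 + x + 1 of Example 6.
print(integrate_terms([(1, 3), (1, 1), (1, 0)]))
# The integrands 2x^2 and 3x^2 of Examples 9 and 10.
print(integrate_terms([(2, 2)]), integrate_terms([(3, 2)]))
```

The last call shows the cancellation noted in the text: when the multiplicative constant equals $n + 1$, the resulting coefficient is exactly 1.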
Rules Involving Substitution 
Now we shall introduce two more rules of integration which seek to simplify the process
of integration, when the circumstances are appropriate, by a substitution of the original
variable of integration. Whenever the newly introduced variable of integration makes the
integration process easier than under the old, these rules will become of service.





Rule VI (the substitution rule) The integral of $f(u)(du/dx)$ with respect to the variable
$x$ is the integral of $f(u)$ with respect to the variable $u$:

\[ \int f(u)\frac{du}{dx}\,dx = \int f(u)\,du = F(u) + c \]

where the operation $\int du$ has been substituted for the operation $\int dx$.
This rule, the integral-calculus counterpart of the chain rule, may be proved by means of
the chain rule itself. Given a function $F(u)$, where $u = u(x)$, the chain rule states that

\[ \frac{d}{dx}F(u) = \frac{d}{du}F(u)\frac{du}{dx} = F'(u)\frac{du}{dx} = f(u)\frac{du}{dx} \]

Since $f(u)(du/dx)$ is the derivative of $F(u)$, it follows from (14.3) that the integral (antiderivative)
of the former must be

\[ \int f(u)\frac{du}{dx}\,dx = F(u) + c \]

You may note that this result, in fact, follows also from the canceling of the two $dx$ expressions
on the left.


Example 12. Find $\int 2x(x^2 + 1)\,dx$. The answer to this can be obtained by first multiplying out the
integrand:

\[ \int 2x(x^2 + 1)\,dx = \int (2x^3 + 2x)\,dx = \frac{x^4}{2} + x^2 + c \]

but let us now do it by the substitution rule. Let $u = x^2 + 1$; then $du/dx = 2x$, or
$dx = du/2x$. Substitution of $du/2x$ for $dx$ will yield

\[ \int 2x(x^2 + 1)\,dx = \int 2xu\,\frac{du}{2x} = \int u\,du = \frac{u^2}{2} + c_1 = \frac{1}{2}(x^4 + 2x^2 + 1) + c_1 = \frac{1}{2}x^4 + x^2 + c \]

where $c = \frac{1}{2} + c_1$. The same answer can also be obtained by substituting $du/dx$ for $2x$
(instead of $du/2x$ for $dx$).


Example 13. Find $\int 6x^2(x^3 + 2)^{99}\,dx$. The integrand of this example is not easily multiplied out, and thus
the substitution rule now has a better opportunity to display its effectiveness. Let
$u = x^3 + 2$; then $du/dx = 3x^2$, so that

\[ \int 6x^2(x^3 + 2)^{99}\,dx = \int \left(2\frac{du}{dx}\right)u^{99}\,dx = \int 2u^{99}\,du = \frac{2}{100}u^{100} + c = \frac{1}{50}(x^3 + 2)^{100} + c \]

Example 14. Find $\int 8e^{2x+3}\,dx$. Let $u = 2x + 3$; then $du/dx = 2$, or $dx = du/2$. Hence,

\[ \int 8e^{2x+3}\,dx = \int 8e^u\,\frac{du}{2} = 4\int e^u\,du = 4e^u + c = 4e^{2x+3} + c \]


As these examples show, this rule is of help whenever we can (by the judicious choice
of a function $u = u(x)$) express the integrand (a function of $x$) as the product of $f(u)$
(a function of $u$) and $du/dx$ (the derivative of the $u$ function which we have chosen). However,
as illustrated by the last two examples, this rule can be used also when the original
integrand is transformable into a constant multiple of $f(u)(du/dx)$. This would not affect
the applicability because the constant multiplier can be factored out of the integral sign,
which would then leave an integrand of the form $f(u)(du/dx)$, as required in the substitution
rule. When the substitution of variables results in a variable multiple of $f(u)(du/dx)$,
say, $x$ times the latter, however, factoring is not permissible, and this rule will be of no help.
In fact, there exists no general formula giving the integral of a product of two functions in
terms of the separate integrals of those functions; nor do we have a general formula giving
the integral of a quotient of two functions in terms of their separate integrals. Herein lies
the reason why integration, on the whole, is more difficult than differentiation and why,
with complicated integrands, it is more convenient to look up the answer in prepared tables
of integration formulas rather than to undertake the integration by oneself.
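When neither a table nor a convenient substitution is at hand, a closed-form answer can at least be cross-checked numerically. The sketch below is an illustration under stated assumptions: it uses composite Simpson's rule (a standard numerical method, not one discussed in the text), the helper name `simpson` and the interval $[0, 1]$ are arbitrary choices, and the comparison relies on taking the difference of the closed form at the two end points, which anticipates the definite integral of Sec. 14.3.

```python
import math

def simpson(func, a, b, panels=1000):
    """Composite Simpson approximation of the integral of func over [a, b]."""
    if panels % 2:
        panels += 1  # Simpson's rule needs an even number of panels
    width = (b - a) / panels
    total = func(a) + func(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * func(a + i * width)
    return total * width / 3.0

# Integrand 2x(x^2 + 1) with the closed form x^4/2 + x^2 found by substitution.
closed_12 = lambda x: x ** 4 / 2 + x ** 2
# Integrand 8e^(2x+3) with the closed form 4e^(2x+3) found by substitution.
closed_14 = lambda x: 4 * math.exp(2 * x + 3)

a, b = 0.0, 1.0
print(simpson(lambda x: 2 * x * (x ** 2 + 1), a, b), closed_12(b) - closed_12(a))
print(simpson(lambda x: 8 * math.exp(2 * x + 3), a, b), closed_14(b) - closed_14(a))
```

On each line the numerical estimate and the closed-form difference should agree to many decimal places; a visible gap would signal an error in the substitution.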


Rule VII (integration by parts) The integral of $v$ with respect to $u$ is equal to $uv$ less
the integral of $u$ with respect to $v$:

\[ \int v\,du = uv - \int u\,dv \]

The essence of this rule is to replace the operation $\int du$ by the operation $\int dv$.
The rationale behind this result is relatively simple. First, the product rule of differentials
gives us

\[ d(uv) = v\,du + u\,dv \]

If we integrate both sides of the equation (i.e., integrate each differential), we get a new
equation

\[ \int d(uv) = \int v\,du + \int u\,dv \]

or

\[ uv = \int v\,du + \int u\,dv \qquad \text{[no constant is needed on the left (why?)]} \]

Then, by subtracting $\int u\,dv$ from both sides, the previously stated result emerges.


Example 15. Find $\int x(x + 1)^{1/2}\,dx$. Unlike Examples 12 and 13, the present example is not amenable to
the type of substitution used in Rule VI. (Why?) However, we may consider the given integral
to be in the form of $\int v\,du$, and apply Rule VII. To this end, we shall let $v = x$, implying
$dv = dx$, and also let $u = \frac{2}{3}(x + 1)^{3/2}$, so that $du = (x + 1)^{1/2}\,dx$. Then we can find the
integral to be

\[ \begin{aligned}
\int x(x + 1)^{1/2}\,dx &= \int v\,du = uv - \int u\,dv \\
&= \frac{2}{3}(x + 1)^{3/2}x - \int \frac{2}{3}(x + 1)^{3/2}\,dx \\
&= \frac{2}{3}(x + 1)^{3/2}x - \frac{4}{15}(x + 1)^{5/2} + c
\end{aligned} \]

Example 16. Find $\int \ln x\,dx$ $(x > 0)$. We cannot apply the logarithmic rule here, because that rule deals
with the integrand $1/x$, not $\ln x$. Nor can we use Rule VI. But if we let $v = \ln x$, implying
$dv = (1/x)\,dx$, and also let $u = x$, so that $du = dx$, then the integration can be performed
as follows:

\[ \begin{aligned}
\int \ln x\,dx &= \int v\,du = uv - \int u\,dv \\
&= x\ln x - \int dx = x\ln x - x + c = x(\ln x - 1) + c
\end{aligned} \]

Example 17. Find $\int xe^x\,dx$. In this case, we shall simply let $v = x$, and $u = e^x$, so that $dv = dx$ and
$du = e^x\,dx$. Applying Rule VII, we then have

\[ \begin{aligned}
\int xe^x\,dx &= \int v\,du = uv - \int u\,dv \\
&= e^x x - \int e^x\,dx = e^x x - e^x + c = e^x(x - 1) + c
\end{aligned} \]


The validity of this result, like those of the preceding examples, can of course be readily 
checked by differentiation. 
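The differentiation check for the three integration-by-parts results can be carried out numerically as well. The Python sketch below is illustrative: the dictionary `cases`, the sample points, and the tolerance are assumptions made for the demonstration, and each pair lists an integrand next to the integral obtained for it by parts (with the constant $c$ omitted).

```python
import math

def numerical_derivative(func, x, step=1e-6):
    """Central-difference approximation of d(func)/dx at x."""
    return (func(x + step) - func(x - step)) / (2.0 * step)

# (integrand, integral found by parts) for the three examples above.
cases = {
    "x(x+1)^(1/2)": (lambda x: x * math.sqrt(x + 1),
                     lambda x: (2 / 3) * (x + 1) ** 1.5 * x
                               - (4 / 15) * (x + 1) ** 2.5),
    "ln x": (lambda x: math.log(x),
             lambda x: x * (math.log(x) - 1)),
    "x e^x": (lambda x: x * math.exp(x),
              lambda x: math.exp(x) * (x - 1)),
}

for name, (integrand, integral) in cases.items():
    for x in (0.5, 1.0, 3.0):
        gap = abs(numerical_derivative(integral, x) - integrand(x))
        print(name, x, gap < 1e-5)
```

Every line should report `True`; a `False` would mean the derivative of the proposed integral does not reproduce the integrand at that point.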





EXERCISE 14.2 


1. Find the following: 
@ / Voxdx (x #0) (d) { 2e* dx 
8 4x 
w [9 dx @ fre 
© | (x5 — 3x) dx (a) | (2ax + by(ax? + bx)? dx 
2, Find: 
(a) / 13e* dx @) | 3 PHN gy 


(b) [De + *) dx (x>0) @ / Axe? 3 de 


(0 [ (se + (f) fre 9 dy 





3. Find:
2
@ [2 ws of ryan
,
wf wen Of sage
4. Find:
(a) $\int (x + 3)(x + 1)^{1/2}\,dx$   (b) $\int x \ln x\,dx$ $(x > 0)$

5. Given $n$ constants $k_i$ (with $i = 1, 2, \ldots, n$) and $n$ functions $f_i(x)$, deduce from Rules IV
and V that

\[ \int \sum_{i=1}^{n} k_i f_i(x)\,dx = \sum_{i=1}^{n} k_i \int f_i(x)\,dx \]


14.3 Definite Integrals



Meaning of Definite Integrals 

All the integrals cited in Sec. 14.2 are of the indefinite variety: each is a function of a variable
and, hence, possesses no definite numerical value. Now, for a given indefinite integral
of a continuous function $f(x)$,

\[ \int f(x)\,dx = F(x) + c \]

if we choose two values of $x$ in the domain, say, $a$ and $b$ $(a < b)$, substitute them successively
into the right side of the equation, and form the difference

\[ [F(b) + c] - [F(a) + c] = F(b) - F(a) \]

we get a specific numerical value, free of the variable $x$ as well as the arbitrary constant $c$.
This value is called the definite integral of $f(x)$ from $a$ to $b$. We refer to $a$ as the lower limit
of integration and to $b$ as the upper limit of integration.

In order to indicate the limits of integration, we now modify the integral sign to the form
$\int_a^b$. The evaluation of the definite integral is then symbolized in the following steps:

\[ \int_a^b f(x)\,dx = F(x)\Big]_a^b = F(b) - F(a) \tag{14.6} \]

where the symbol $\big]_a^b$ (also written $\big|_a^b$ or $[\cdots]_a^b$) is an instruction to substitute $b$ and $a$, successively,
for $x$ in the result of integration to get $F(b)$ and $F(a)$, and then take their
difference, as indicated on the right of (14.6). As the first step, however, we must find the
indefinite integral, although we may omit the constant $c$, since the latter will drop out in the
process of difference-taking anyway.


Example 1. Evaluate $\int_1^5 3x^2\,dx$. Since the indefinite integral is $x^3 + c$, this definite integral has the value

\[ \int_1^5 3x^2\,dx = x^3\Big]_1^5 = (5)^3 - (1)^3 = 125 - 1 = 124 \]

Example 2. Evaluate $\int_a^b ke^x\,dx$. Here, the limits of integration are given in symbols; consequently, the
result of integration is also in terms of those symbols:

\[ \int_a^b ke^x\,dx = ke^x\Big]_a^b = k(e^b - e^a) \]

Example 3. Evaluate $\int_0^4 \left(\frac{1}{1 + x} + 2x\right) dx$ $(x \neq -1)$. The indefinite integral is $\ln |1 + x| + x^2 + c$; thus
the answer is

\[ \begin{aligned}
\int_0^4 \left(\frac{1}{1 + x} + 2x\right) dx &= \left[\ln |1 + x| + x^2\right]_0^4 \\
&= (\ln 5 + 16) - (\ln 1 + 0) \\
&= \ln 5 + 16 \qquad [\text{since } \ln 1 = 0]
\end{aligned} \]
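The evaluation scheme in (14.6) translates directly into a small function. The Python sketch below is a minimal illustration; the name `definite_integral` and the constant 42 are assumptions, and the primitive functions are supplied by hand rather than derived by the code.

```python
import math

def definite_integral(primitive, a, b):
    """Evaluate F(b) - F(a) for a primitive function F, as in (14.6)."""
    return primitive(b) - primitive(a)

# Integrand 3x^2 on [1, 5], using the primitive x^3.
example_1 = definite_integral(lambda x: x ** 3, 1, 5)

# Integrand 1/(1 + x) + 2x on [0, 4], using the primitive ln|1 + x| + x^2.
example_3 = definite_integral(lambda x: math.log(abs(1 + x)) + x ** 2, 0, 4)

# Adding a constant to the primitive leaves the value unchanged.
shifted = definite_integral(lambda x: x ** 3 + 42, 1, 5)

print(example_1, example_3, shifted)
```

The third call illustrates why the constant $c$ may be omitted: it cancels in the difference $F(b) - F(a)$.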


It is important to realize that the limits of integration $a$ and $b$ both refer to values of the
variable $x$. Were we to use the substitution-of-variables technique (Rules VI and VII) during
integration and introduce a variable $u$, care should be taken not to consider $a$ and $b$ as
the limits of $u$. Example 4 will illustrate this point.

Example 4. Evaluate $\int_1^2 (2x^3 - 1)^2(6x^2)\,dx$. Let $u = 2x^3 - 1$; then $du/dx = 6x^2$, or $du = 6x^2\,dx$. Now
notice that, when $x = 1$, $u$ will be 1 but that, when $x = 2$, $u$ will be 15; in other words,
the limits of integration in terms of the variable $u$ should be 1 (lower) and 15 (upper).
Rewriting the given integral in $u$ will therefore give us not $\int_1^2 u^2\,du$ but

\[ \int_1^{15} u^2\,du = \frac{1}{3}u^3\Big]_1^{15} = \frac{1}{3}(15^3 - 1^3) = 1{,}124\tfrac{2}{3} \]

Alternatively, we may first convert $u$ back to $x$ and then use the original limits of 1 and 2 to
get the identical answer:

\[ \left[\frac{1}{3}u^3\right]_{u=1}^{u=15} = \left[\frac{1}{3}(2x^3 - 1)^3\right]_{x=1}^{x=2} = \frac{1}{3}(15^3 - 1^3) = 1{,}124\tfrac{2}{3} \]
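The point about limits can be made concrete with exact arithmetic. The following Python sketch is illustrative only; the helper names are hypothetical. It evaluates the integral of Example 4 by both correct routes and also shows the value produced by the mistake the text warns against, namely applying the $x$ limits directly to the variable $u$.

```python
from fractions import Fraction

def u_of_x(x):
    """Substitution used in Example 4: u = 2x^3 - 1."""
    return 2 * x ** 3 - 1

def primitive_in_u(u):
    """Primitive of u^2 with respect to u."""
    return Fraction(u ** 3, 3)

def primitive_in_x(x):
    """The same primitive converted back to the variable x."""
    return primitive_in_u(u_of_x(x))

x_lower, x_upper = 1, 2

# Route 1: convert the limits to u (1 and 15) and stay in u.
route_1 = primitive_in_u(u_of_x(x_upper)) - primitive_in_u(u_of_x(x_lower))

# Route 2: convert the primitive back to x and keep the limits 1 and 2.
route_2 = primitive_in_x(x_upper) - primitive_in_x(x_lower)

# The mistake warned against: using the x limits as if they were u limits.
wrong = primitive_in_u(x_upper) - primitive_in_u(x_lower)

print(route_1, route_2, wrong)
```

The two correct routes coincide, while the mistaken version gives a very different number, which makes the error easy to spot.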


A Definite Integral as an Area under a Curve 
Every definite integral has a definite value. That value may be interpreted geometrically to 
be a particular area under a given curve. 

The graph of a continuous function $y = f(x)$ is drawn in Fig. 14.1. If we seek to measure
the (shaded) area $A$ enclosed by the curve and the $x$ axis between the two points $a$ and $b$
in the domain, we may proceed in the following manner. First, we divide the interval $[a, b]$
into $n$ subintervals (not necessarily equal in length). Four of these are drawn in Fig. 14.1a
(that is, $n = 4$), the first being $[x_1, x_2]$ and the last, $[x_4, x_5]$. Since each of these represents
a change in $x$, we may refer to them as $\Delta x_1, \ldots, \Delta x_4$, respectively. Now, on the subintervals
let us construct four rectangular blocks such that the height of each block is equal
to the highest value of the function attained in that block (which happens to occur at
the left-side boundary of each rectangle here). The first block thus has the height $f(x_1)$ and
the width $\Delta x_1$, and, in general, the $i$th block has the height $f(x_i)$ and the width $\Delta x_i$. The
total area $A^*$ of this set of blocks is the sum

\[ A^* = \sum_{i=1}^{n} f(x_i)\,\Delta x_i \qquad (n = 4 \text{ in Fig. 14.1a}) \]

This, though, is obviously not the area under the curve we seek, but only a very rough
approximation thereof.

FIGURE 14.1 (panels a and b)

What makes $A^*$ deviate from the true value of $A$ is the unshaded portion of the rectangular
blocks; these make $A^*$ an overestimate of $A$. If the unshaded portion can be shrunk
in size and be made to approach zero, however, the approximation value $A^*$ will correspondingly
approach the true value $A$. This result will materialize when we try a finer and
finer segmentation of the interval $[a, b]$, so that $n$ is increased and $\Delta x_i$ is shortened indefinitely.
Then the blocks will become more slender (if more numerous), and the protrusion
beyond the curve will diminish, as can be seen in Fig. 14.1b. Carried to the limit, this
"slenderizing" operation yields

\[ \lim_{n \to \infty} \sum_{i=1}^{n} f(x_i)\,\Delta x_i = \lim_{n \to \infty} A^* = \text{area } A \tag{14.7} \]



provided this limit exists. (It does in the present case.) This equation, indeed, constitutes the 
formal definition of an area under a curve. 


The summation expression in (14.7), $\sum_{i=1}^{n} f(x_i)\,\Delta x_i$, bears a certain resemblance to the
definite integral expression $\int_a^b f(x)\,dx$. Indeed, the latter is based on the former. The
replacement of $\Delta x_i$ by the differential $dx$ is done in the same spirit as in our earlier discussion
of "approximation" in Sec. 8.1. Thus, we rewrite $f(x_i)\,\Delta x_i$ into $f(x)\,dx$. What about
the summation sign? The $\sum_{i=1}^{n}$ notation represents the sum of a finite number of terms. When
we let $n \to \infty$, and take the limit of that sum, the regular notation for such an operation is
rather cumbersome. Thus a simpler substitute is needed. That substitute is $\int_a^b$, where the
elongated S symbol also indicates a sum, and where $a$ and $b$ (just as $i = 1$ and $n$) serve to
specify the lower and upper limits of this sum. In short, the definite integral is a shorthand
for the limit-of-a-sum expression in (14.7). That is,

\[ \int_a^b f(x)\,dx = \lim_{n \to \infty} \sum_{i=1}^{n} f(x_i)\,\Delta x_i = \text{area } A \]

Thus the said definite integral (referred to as a Riemann integral) now has an area connotation
as well as a sum connotation, because $\int_a^b$ is the continuous counterpart of the
discrete concept of $\sum_{i=1}^{n}$.

In Fig. 14.1, we attempted to approximate area $A$ by systematically reducing an overestimate
$A^*$ by finer segmentation of the interval $[a, b]$. The resulting limit of the sum of
block areas is called the upper integral, an approximation from above. We could also have
approximated area $A$ from below by forming rectangular blocks inscribed by the curve
rather than protruding beyond it (see Exercise 14.3-3). The total area $A^{**}$ of this new set of
blocks will underestimate $A$, but as the segmentation of $[a, b]$ becomes finer and finer, we
shall again find $\lim_{n \to \infty} A^{**} = A$. The last-cited limit of the sum of block areas is called the
lower integral. If, and only if, the upper integral and lower integral are equal in value, then
the Riemann integral $\int_a^b f(x)\,dx$ is defined, and the function $f(x)$ is said to be Riemann
integrable. There exist theorems specifying the conditions under which a function $f(x)$ is
integrable. According to the fundamental theorem of calculus, a function is integrable in
$[a, b]$ if it is continuous in that interval. As long as we are working with continuous functions,
therefore, we should have no worries in this regard.
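The squeeze between the approximation from above and the approximation from below can be observed numerically. The Python sketch below is an illustration under stated assumptions: it uses equal-width subintervals and the decreasing function $f(x) = 1/x$ on $[1, 2]$, neither of which comes from the text, and the helper name `riemann_sums` is hypothetical. For a decreasing function the highest value in each subinterval occurs at its left boundary and the lowest at its right boundary.

```python
def riemann_sums(func, a, b, n):
    """Upper and lower sums for a DECREASING func on [a, b], n equal subintervals.

    For a decreasing function the highest value in each subinterval occurs at
    its left boundary and the lowest value at its right boundary.
    """
    width = (b - a) / n
    upper = sum(func(a + i * width) * width for i in range(n))
    lower = sum(func(a + (i + 1) * width) * width for i in range(n))
    return upper, lower

# Assumed example: f(x) = 1/x on [1, 2], whose exact area is ln 2.
for n in (4, 40, 400, 4000):
    upper, lower = riemann_sums(lambda x: 1.0 / x, 1.0, 2.0, n)
    print(n, upper, lower, upper - lower)
```

As $n$ grows, the two sums close in on a common value from opposite sides, and the gap between them shrinks in proportion to $1/n$.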

Another point may be noted. Although the area $A$ in Fig. 14.1 happens to lie entirely
under a decreasing portion of the curve $y = f(x)$, the conceptual equating of a definite integral
with an area is valid also for upward-sloping portions of the curve. In fact, both types
of slope may be present simultaneously; e.g., we can calculate $\int_0^b f(x)\,dx$ as the area
under the curve in Fig. 14.1 above the line $0b$.


FIGURE 14.2 (graph of $y = f(x)$)

Note that, if we calculate the area $B$ in Fig. 14.2 by the definite integral $\int_a^b f(x)\,dx$, the
answer will come out negative, because the height of each rectangular block involved in
this area is negative. This gives rise to the notion of a negative area, an area that lies below
the $x$ axis and above a given curve. In case we are interested in the numerical rather than the
algebraic value of such an area, therefore, we should take the absolute value of the relevant
definite integral. The area $C = \int_c^d f(x)\,dx$, on the other hand, has a positive sign even
though it lies in the negative region of the $x$ axis; this is because each rectangular block has
a positive height as well as a positive width when we are moving from $c$ to $d$. From this, the
implication is clear that interchange of the two limits of integration would, by reversing the
direction of movement, alter the sign of $\Delta x_i$ and of the definite integral. Applied to area $B$,
we see that the definite integral $\int_b^a f(x)\,dx$ (from $b$ to $a$) will give the negative of the area
$B$; this will measure the numerical value of this area.


Some Properties of Definite Integrals
The discussion in the preceding paragraph leads us to the following property of definite
integrals.

Property I The interchange of the limits of integration changes the sign of the definite
integral:

\[ \int_b^a f(x)\,dx = -\int_a^b f(x)\,dx \]

This can be proved as follows:

\[ \int_b^a f(x)\,dx = F(a) - F(b) = -[F(b) - F(a)] = -\int_a^b f(x)\,dx \]

Definite integrals also possess some other interesting properties.

Property II A definite integral has a value of zero when the two limits of integration are
identical:

\[ \int_a^a f(x)\,dx = F(a) - F(a) = 0 \]

Under the "area" interpretation, this means that the area (under a curve) above any single
point in the domain is nil. This is as it should be, because on top of a point on the $x$ axis,
we can draw only a (one-dimensional) line, never a (two-dimensional) area.

FIGURE 14.3 (panels a and b)


Property III A definite integral can be expressed as a sum of a finite number of definite
subintegrals as follows:

\[ \int_a^d f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx + \int_c^d f(x)\,dx \qquad (a < b < c < d) \]

Only three subintegrals are shown in this equation, but the extension to the case of $n$
subintegrals is also valid. This property is sometimes described as the additivity property.

In terms of area, this means that the area (under the curve) lying above the interval $[a, d]$
on the $x$ axis can be obtained by summing the areas lying above the subintervals in the set
$\{[a, b], [b, c], [c, d]\}$. Note that, since we are dealing with closed intervals, the border
points $b$ and $c$ have each been included in two areas. Is this not double counting? It indeed
is. But fortunately no damage is done, because by Property II the area above a single point
is zero, so that the double counting produces no effect on the calculation. But, needless to
say, the double counting of any interval is never permitted.

Earlier, it was mentioned that all continuous functions are Riemann integrable. Now, by
Property III, we can also find the definite integrals (areas) of certain discontinuous functions.
Consider the step function in Fig. 14.3a. In spite of the discontinuity at point $b$ in the
interval $[a, c]$, we can find the shaded area from the sum

\[ \int_a^b f(x)\,dx + \int_b^c f(x)\,dx \]

The same also applies to the curve in Fig. 14.3b.
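The splitting device for a discontinuous integrand can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the step heights 2 and 5, the break point at $x = 1$, the interval $[0, 3]$, and the use of a midpoint sum as the numerical stand-in for each subintegral.

```python
def midpoint_sum(func, a, b, n=1000):
    """Midpoint approximation of the definite integral of func over [a, b]."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width

def step(x):
    """Assumed step function: height 2 on [0, 1), height 5 from 1 onward."""
    return 2.0 if x < 1.0 else 5.0

# Split [0, 3] at the point of discontinuity, x = 1, and add the two pieces.
left_piece = midpoint_sum(step, 0.0, 1.0)
right_piece = midpoint_sum(step, 1.0, 3.0)
print(left_piece, right_piece, left_piece + right_piece)
```

Within each piece the function is continuous (indeed constant), so each subintegral is simply height times width, and their sum gives the total shaded area.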


Property IV

\[ \int_a^b -f(x)\,dx = -\int_a^b f(x)\,dx \]

Property V

\[ \int_a^b kf(x)\,dx = k\int_a^b f(x)\,dx \]

Property VI

\[ \int_a^b [f(x) + g(x)]\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx \]

Property VII (integration by parts) Given $u(x)$ and $v(x)$,

\[ \int_{x=a}^{x=b} v\,du = uv\Big]_{x=a}^{x=b} - \int_{x=a}^{x=b} u\,dv \]



These last four properties, all borrowed from the rules of indefinite integration, should 
require no further explanation. 





Another Look at the Indefinite Integral 
We introduced the definite integral by way of attaching two limits of integration to an
indefinite integral. Now that we know the meaning of the definite integral, let us see how
we can revert from the latter to the indefinite integral.

Suppose that, instead of fixing the upper limit of integration at $b$, we allow it to be a
variable, designated simply as $x$. Then the integral will take the form

\[ \int_a^x f(x)\,dx = F(x) - F(a) \]

which, now being a function of $x$, denotes a variable area under the curve of $f(x)$. But
since the last term on the right is a constant, this integral must be a member of the family
of primitive functions of $f(x)$, which we denoted earlier as $F(x) + c$. If we set $c = -F(a)$,
then the above integral becomes exactly the indefinite integral $\int f(x)\,dx$.

From this point of view, therefore, we may consider the $\int$ symbol to mean the same as
$\int_a^x$, provided it is understood that in the latter version of the symbol the lower limit of
integration is related to the constant of integration by the equation $c = -F(a)$.
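The link between the lower limit and the constant of integration can be illustrated with a short Python sketch. The integrand $3x^2$ with primitive $x^3$, the lower limits 1 and 2, and the helper name `area_function` are assumptions chosen for the demonstration.

```python
def area_function(primitive, a):
    """Return x -> F(x) - F(a), the definite integral with variable upper limit."""
    constant = -primitive(a)  # plays the role of c = -F(a)
    return lambda x: primitive(x) + constant

# Assumed example: f(x) = 3x^2 with primitive F(x) = x^3.
primitive = lambda x: x ** 3
from_one = area_function(primitive, 1)
from_two = area_function(primitive, 2)

# Changing the lower limit only shifts the function by a constant.
for x in (2, 3, 5):
    print(x, from_one(x), from_two(x), from_one(x) - from_two(x))
```

The last column is the same for every $x$: two choices of lower limit yield two members of the same family $F(x) + c$, differing only in the value of $c$.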





EXERCISE 14.3 


1, Evaluate the following: 


@ [ ye de (@) [ “et 6) dx 


of xO? + 6) dx of. (ox? + bx +0) dx 
0 


(9 f 3m wf (5 x 1a 


2. Evaluate the following: 


p2 3 
@ | a dy a) [ (e* + e%) dx 


wf a of (b+ ag)e 







3. In Fig. 14.1a, take the lowest value of the function attained in each subinterval as the
height of the rectangular block, i.e., take f(x2) instead of f(x1) as the height of the first
block, though still retaining Δx as its width, and do likewise for the other blocks.

(a) Write a summation expression for the total area A** of the new rectangles.

(b) Does A** overestimate or underestimate the desired area A?

(c) Would A** tend to approach or to deviate further from A if a finer segmentation of
[a, b] were introduced? (Hint: Try a diagram.)

(d) In the limit, when the number n of subintervals approaches ∞, would the approxi-
mation value A** approach the true value A, just as the approximation value A* did?

(e) What can you conclude from (a) to (d) about the Riemann integrability of the
function f(x) in Fig. 14.1a?


4. The definite integral $\int_a^b f(x)\,dx$ is said to represent an area under a curve. Does this
curve refer to the graph of the integrand f(x), or of the primitive function F(x)? If we
plot the graph of the F(x) function, how can we show the given definite integral on
it: by an area, a line segment, or a point?
5. Verify that a constant c can be equivalently expressed as a definite integral:

$$\text{(a)}\;\; c \equiv \int_0^b \frac{c}{b}\,dx \qquad\qquad \text{(b)}\;\; c \equiv \int_0^c 1\,dt$$


14.4 Improper Integrals 


Certain integrals are said to be “improper.” We shall briefly discuss two varieties thereof. 





Infinite Limits of Integration 
When we have definite integrals of the form

$$\int_a^{\infty} f(x)\,dx \qquad \text{and} \qquad \int_{-\infty}^{b} f(x)\,dx$$

with one limit of integration being infinite, we refer to them as improper integrals. In these
cases, it is not possible to evaluate the integrals as, respectively,

$$F(\infty) - F(a) \qquad \text{and} \qquad F(b) - F(-\infty)$$

because ∞ is not a number, and therefore it cannot be substituted for x in the function
F(x). Instead, we must resort once more to the concept of limits.

The first improper integral we cited can be defined to be the limit of another (proper)
integral as the latter's upper limit of integration tends to ∞; that is,

$$\int_a^{\infty} f(x)\,dx \equiv \lim_{b \to \infty} \int_a^b f(x)\,dx \tag{14.8}$$

If this limit exists, the improper integral is said to be convergent (or to converge), and the
limiting process will yield the value of the integral. If the limit does not exist, the improper
integral is said to be divergent and is in fact meaningless. By the same token, we can define

$$\int_{-\infty}^{b} f(x)\,dx \equiv \lim_{a \to -\infty} \int_a^b f(x)\,dx \tag{14.8'}$$

with the same criterion of convergence and divergence.


Example 1

Evaluate $\int_1^{\infty} \frac{dx}{x^2}$. First we note that

$$\int_1^b \frac{dx}{x^2} = \frac{-1}{x}\,\Big|_1^b = \frac{-1}{b} + 1$$

Hence, in line with (14.8), the desired integral is

$$\int_1^{\infty} \frac{dx}{x^2} = \lim_{b \to \infty} \int_1^b \frac{dx}{x^2} = \lim_{b \to \infty} \left(\frac{-1}{b} + 1\right) = 1$$

This improper integral does converge, and it has a value of 1.

Since the limit expression is cumbersome to write, some people prefer to omit the "lim"
notation and write simply

$$\int_1^{\infty} \frac{dx}{x^2} = \frac{-1}{x}\,\Big|_1^{\infty} = 0 + 1 = 1$$

Even when written in this form, however, the improper integral should nevertheless be
interpreted with the limit concept in mind.

Graphically, this improper integral still has the connotation of an area. But since the
upper limit of integration is allowed to take on increasingly larger values in this case,
the right-side boundary must be extended eastward indefinitely, as shown in Fig. 14.4a.
Despite this, we are able to consider the area to have the definite (limit) value of 1.

Example 2

Evaluate $\int_1^{\infty} \frac{dx}{x}$. As before, we first find

$$\int_1^b \frac{dx}{x} = \ln x\,\Big|_1^b = \ln b - \ln 1 = \ln b$$

When we let b → ∞, by (10.16') we have ln b → ∞. Thus the given improper integral is
divergent.

Figure 14.4b shows the graph of the function 1/x, as well as the area corresponding to
the given integral. The indefinite eastward extension of the right-side boundary will result
this time in an infinite area, even though the shape of the graph displays a superficial
similarity to that of Fig. 14.4a.

FIGURE 14.4 [figure not reproduced: panel (a) shows the area under the curve of Example 1 and panel (b) the area under the graph of 1/x, each extending to the right of x = 1]
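The contrast between Examples 1 and 2 can also be seen numerically. The following is a minimal Python sketch (the function names are illustrative and not part of the text) that evaluates the two proper integrals from 1 to b, using the antiderivatives found in the examples, for increasingly large b. The first sequence settles near 1, while the second keeps growing.

```python
import math

def integral_inv_square(b):
    """Proper integral of 1/x**2 from 1 to b, via the antiderivative -1/x."""
    return -1.0 / b + 1.0

def integral_inv(b):
    """Proper integral of 1/x from 1 to b, via the antiderivative ln x."""
    return math.log(b) - math.log(1.0)

# Let the upper limit b grow and watch the two proper integrals.
for b in (10.0, 1e3, 1e6):
    print(f"b = {b:>9.0f}   1/x^2: {integral_inv_square(b):.6f}   1/x: {integral_inv(b):.4f}")
```

A table of this kind only suggests convergence or divergence; the conclusion itself still rests on the limits taken in the examples.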


What if both limits of integration are infinite? A direct extension of (14.8) and (14.8')
would suggest the definition

$$\int_{-\infty}^{\infty} f(x)\,dx \equiv \lim_{\substack{b \to +\infty \\ a \to -\infty}} \int_a^b f(x)\,dx \tag{14.8''}$$

Again, this improper integral is said to converge if and only if the limit in question exists.

Infinite Integrand

Even with finite limits of integration, an integral can still be improper if the integrand be-
comes infinite somewhere in the interval of integration [a, b]. To evaluate such an integral,
we must again rely upon the concept of a limit.

Example 3

Evaluate $\int_0^1 \frac{1}{x}\,dx$. This integral is improper because, as Fig. 14.4b shows, the integrand is
infinite at the lower limit of integration (1/x → ∞ as x → 0+). Therefore we should first
find the integral

$$\int_a^1 \frac{1}{x}\,dx = \ln x\,\Big|_a^1 = \ln 1 - \ln a = -\ln a \qquad [\text{for } a > 0]$$

and then evaluate its limit as a → 0+:

$$\int_0^1 \frac{1}{x}\,dx \equiv \lim_{a \to 0^+} \int_a^1 \frac{1}{x}\,dx = \lim_{a \to 0^+} (-\ln a)$$

Since this limit does not exist (as a → 0+, ln a → −∞), the given integral is divergent.

Example 4

Evaluate $\int_0^9 x^{-1/2}\,dx$. When x → 0+, the integrand $1/\sqrt{x}$ becomes infinite; the integral is
improper. Again, we can first find

$$\int_a^9 x^{-1/2}\,dx = 2x^{1/2}\,\Big|_a^9 = 6 - 2\sqrt{a}$$

The limit of this expression as a → 0+ is 6 − 0 = 6. Thus the given integral is convergent
(to 6).
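For any a > 0 the integral in Example 4 is proper, so it can be approximated by an ordinary Riemann-type sum. The sketch below is an illustration only: the midpoint rule and the helper name `midpoint_integral` are choices made here, not something taken from the text. It compares the numerical value with the closed form 6 − 2√a for two values of a.

```python
import math

def midpoint_integral(f, lo, hi, n=100_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

def integrand(x):
    return x ** -0.5  # the integrand 1/sqrt(x) of Example 4

for a in (1.0, 0.01):
    numeric = midpoint_integral(integrand, a, 9.0)
    closed_form = 6.0 - 2.0 * math.sqrt(a)
    print(f"a = {a}:  numeric = {numeric:.6f}   6 - 2*sqrt(a) = {closed_form:.6f}")
```

As a is pushed toward zero the closed form approaches 6, in line with the limit taken in the example; the numerical sum needs ever finer subdivisions near the lower limit, which is one practical sign that the integral is improper.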


The situation where the integrand becomes infinite at the upper limit of integration is
perfectly similar. It is an altogether different proposition, however, when an infinite value
of the integrand occurs in the open interval (a, b) rather than at a or b. In this eventuality,
it is necessary to take advantage of the additivity of definite integrals and first decompose
the given integral into subintegrals. Assume that f(x) → ∞ as x → p, where p is a point
in the interval (a, b); then, by the additivity property, we have

$$\int_a^b f(x)\,dx = \int_a^p f(x)\,dx + \int_p^b f(x)\,dx$$

The given integral on the left can be considered as convergent if and only if each subinte-
gral has a limit.


Example 5

Evaluate $\int_{-1}^{1} \frac{1}{x^3}\,dx$. The integrand tends to infinity when x approaches zero; thus we must
write the given integral as the sum

$$\int_{-1}^{1} x^{-3}\,dx = \int_{-1}^{0} x^{-3}\,dx + \int_{0}^{1} x^{-3}\,dx \qquad (\text{say, } \equiv I_1 + I_2)$$

The integral $I_1$ is divergent, because

$$\lim_{b \to 0^-} \int_{-1}^{b} x^{-3}\,dx = \lim_{b \to 0^-} \left[-\frac{1}{2}x^{-2}\right]_{-1}^{b} = \lim_{b \to 0^-} \left(-\frac{1}{2b^2} + \frac{1}{2}\right) = -\infty$$

Thus, we can conclude immediately, without having to evaluate $I_2$, that the given integral
is divergent.





EXERCISE 14.4 


1. Check the definite integrals given in Exercises 14.3-1 and 14.3-2 to determine whether
any of them is improper. If improper, indicate which variety of improper integral each
one is.

2. Which of the following integrals are improper, and why?


i 0 
@ { en” dt @ | en dt 
‘0 6 
3 § d 
(6) [ xt de (e) [ part 
1 4 
© [x de of 6d 
JO 3 


3. Evaluate all the improper integrals in Prob. 2.
4. Evaluate the integral $I_2$ of Example 5, and show that it is also divergent.
5. (a) Graph the function $y = ce^{-t}$ for nonnegative t, (c > 0), and shade the area under
the curve.
(b) Write a mathematical expression for this area, and determine whether it is a finite
area.


14.5 Some Economic Applications of Integrals 





Integrals are used in economic analysis in various ways. We shall illustrate a few simple 
applications in the present section and then show the application to the Domar growth 
model in Sec. 14.6. 


From a Marginal Function to a Total Function 

Given a total function (e.g., a total-cost function), the process of differentiation can yield
the marginal function (e.g., the marginal-cost function). Because the process of integration 
is the opposite of differentiation, it should enable us, conversely, to infer the total function 
from a given marginal function. 


Example 1

If the marginal cost (MC) of a firm is the following function of output, $C'(Q) = 2e^{0.2Q}$, and
if the fixed cost is $C_F = 90$, find the total-cost function C(Q). By integrating C'(Q) with
respect to Q, we find that

$$\int 2e^{0.2Q}\,dQ = 2\,\frac{1}{0.2}\,e^{0.2Q} + c = 10e^{0.2Q} + c \tag{14.9}$$

This result may be taken as the desired C(Q) function except that, in view of the arbitrary
constant c, the answer appears indeterminate. Fortunately, the information that $C_F = 90$
can be used as an initial condition to definitize the constant. When Q = 0, total cost C
will consist solely of $C_F$. Setting Q = 0 in the result of (14.9), therefore, we should get a
value of 90; that is, $10e^0 + c = 90$. But this would imply that c = 90 − 10 = 80. Hence, the
total-cost function is

$$C(Q) = 10e^{0.2Q} + 80$$

Note that, unlike the case of (14.2), where the arbitrary constant c has the same value as
the initial value of the variable H(0), in the present example we have c = 80 but
$C(0) \equiv C_F = 90$, so that the two take different values. In general, it should not be assumed
that the arbitrary constant c will always be equal to the initial value of the total function.

Example 2

If the marginal propensity to save (MPS) is the following function of income, $S'(Y) =
0.3 - 0.1Y^{-1/2}$, and if the aggregate savings S is nil when income Y is 81, find the saving
function S(Y). As the MPS is the derivative of the S function, the problem now calls for the
integration of S'(Y):

$$S(Y) = \int (0.3 - 0.1Y^{-1/2})\,dY = 0.3Y - 0.2Y^{1/2} + c$$

The specific value of the constant c can be found from the fact that S = 0 when Y = 81.
Even though, strictly speaking, this is not an initial condition (not relating to Y = 0), substi-
tution of this information into the preceding integral will nevertheless serve to definitize c.
Since

$$0 = 0.3(81) - 0.2(9) + c \quad\Rightarrow\quad c = -22.5$$

the desired saving function is

$$S(Y) = 0.3Y - 0.2Y^{1/2} - 22.5$$
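Because an integral can always be checked by differentiation, the results of Examples 1 and 2 lend themselves to a quick numerical check. The sketch below is illustrative only: the function names and the central-difference helper are assumptions introduced here. It confirms that each total function reproduces its marginal function and satisfies the stated condition (C(0) = 90 and S(81) = 0).

```python
import math

def total_cost(Q):
    return 10.0 * math.exp(0.2 * Q) + 80.0   # C(Q) from Example 1

def marginal_cost(Q):
    return 2.0 * math.exp(0.2 * Q)           # C'(Q) as given

def saving(Y):
    return 0.3 * Y - 0.2 * math.sqrt(Y) - 22.5   # S(Y) from Example 2

def marginal_saving(Y):
    return 0.3 - 0.1 / math.sqrt(Y)              # S'(Y) as given

def central_diff(f, x, h=1e-5):
    """Numerical derivative of f at x (central difference)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

print("C(0)  =", total_cost(0.0))    # should equal the fixed cost, 90
print("S(81) =", saving(81.0))       # should be (numerically) zero
print("C'(5)   numeric vs given:", central_diff(total_cost, 5.0), marginal_cost(5.0))
print("S'(100) numeric vs given:", central_diff(saving, 100.0), marginal_saving(100.0))
```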


The technique illustrated in Examples 1 and 2 can be extended directly to other problems
involving the search for total functions (such as total revenue, total consumption)
from given marginal functions. It may also be reiterated that in problems of this type the
validity of the answer (an integral) can always be checked by differentiation.


Investment and Capital Formation 


Capital formation is the process of adding to a given stock of capital. Regarding this
process as continuous over time, we may express capital stock as a function of time, K(t),
and use the derivative dK/dt to denote the rate of capital formation.* But the rate of capital
formation at time t is identical with the rate of net investment flow at time t, denoted by
I(t). Thus, capital stock K and net investment I are related by the following two equations:

$$\frac{dK}{dt} \equiv I(t)$$

and

$$K(t) = \int I(t)\,dt = \int \frac{dK}{dt}\,dt = \int dK$$

* As a matter of notation, the derivative of a variable with respect to time often is also denoted by a
dot placed over the variable, such as $\dot{K} \equiv dK/dt$. In dynamic analysis, where derivatives with respect
to time occur in abundance, this more concise symbol can contribute substantially to notational
simplicity. However, a dot, being such a tiny mark, is easily lost sight of or misplaced; thus, great care
is required in using this symbol.

The first of the preceding equations is an identity; it shows the synonymity between net
investment and the increment of capital. Since I(t) is the derivative of K(t), it stands to
reason that K(t) is the integral or antiderivative of I(t), as shown in the second equation.
The transformation of the integrand in the latter equation is also easy to comprehend: The
switch from I to dK/dt is by definition, and the next transformation is by cancellation of
two identical differentials, i.e., by the substitution rule.

Sometimes the concept of gross investment is used together with that of net investment
in a model. Denoting gross investment by $I_g$ and net investment by I, we can relate them to
each other by the equation

$$I_g = I + \delta K$$

where δ represents the rate of depreciation of capital and δK, the rate of replacement
investment.

Example 3

Suppose that the net investment flow is described by the equation $I(t) = 3t^{1/2}$ and that the
initial capital stock, at time t = 0, is K(0). What is the time path of capital K? By integrating
I(t) with respect to t, we obtain

$$K(t) = \int I(t)\,dt = \int 3t^{1/2}\,dt = 2t^{3/2} + c$$

Next, letting t = 0 in the leftmost and rightmost expressions, we find K(0) = c. Therefore,
the time path of K is

$$K(t) = 2t^{3/2} + K(0) \tag{14.10}$$

Observe the basic similarity between the results in (14.10) and in (14.2).


The concept of definite integral enters into the picture when one desires to find the
amount of capital formation during some interval of time (rather than the time path of K).
Since $\int I(t)\,dt = K(t)$, we may write the definite integral

$$\int_a^b I(t)\,dt = K(t)\,\Big|_a^b = K(b) - K(a)$$

to indicate the total capital accumulation during the time interval [a, b]. Of course, this also
represents an area under the I(t) curve. It should be noted, however, that in the graph of the
K(t) function, this definite integral would appear instead as a vertical distance, or more
specifically, as the difference between the two vertical distances K(b) and K(a). (cf. Exer-
cise 14.3-4.)

To appreciate this distinction between K(t) and I(t) more fully, let us emphasize that
capital K is a stock concept, whereas investment I is a flow concept. Accordingly, while
K(t) tells us the amount of K existing at each point of time, I(t) gives us the information
about the rate of (net) investment per year (or per period of time) which is prevailing at
each point of time. Thus, in order to calculate the amount of net investment undertaken
(capital accumulation), we must first specify the length of the interval involved. This fact
can also be seen when we rewrite the identity dK/dt ≡ I(t) as dK ≡ I(t) dt, which states
that dK, the increment in K, is based not only on I(t), the rate of flow, but also on dt, the
time that elapsed. It is this need to specify the time interval in the expression I(t) dt that
brings the definite integral into the picture, and gives rise to the area representation under
the I(t) curve, as against the K(t) curve.

Example 4

If net investment is a constant flow at I(t) = 1,000 (dollars per year), what will be the total
net investment (capital formation) during a year, from t = 0 to t = 1? Obviously, the answer
is $1,000; this can be obtained formally as follows:

$$\int_0^1 I(t)\,dt = \int_0^1 1{,}000\,dt = 1{,}000t\,\Big|_0^1 = 1{,}000$$

You can verify that the same answer will emerge if, instead, the year involved is from t = 1
to t = 2.

Example 5

If $I(t) = 3t^{1/2}$ (thousands of dollars per year), a nonconstant flow, what will be the capi-
tal formation during the time interval [1, 4], that is, during the second, third, and fourth
years? The answer lies in the definite integral

$$\int_1^4 3t^{1/2}\,dt = 2t^{3/2}\,\Big|_1^4 = 16 - 2 = 14$$

On the basis of the preceding examples, we may express the amount of capital accumu-
lation during the time interval [0, t], for any investment rate I(t), by the definite integral

$$\int_0^t I(t)\,dt = K(t)\,\Big|_0^t = K(t) - K(0)$$

Figure 14.5 illustrates the case of the time interval $[0, t_0]$. Viewed differently, the preceding
equation yields the following expression for the time path K(t):

$$K(t) = K(0) + \int_0^t I(t)\,dt$$

The amount of K at any time t is the initial capital plus the total capital accumulation that
has occurred since.

FIGURE 14.5 [figure not reproduced: the curve I = I(t) plotted against t, with the area under it over $[0, t_0]$ labeled $\int_0^{t_0} I(t)\,dt = K(t_0) - K(0)$]
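The stock-flow relation above can be illustrated by integrating the investment flow numerically and comparing the result with the change in the capital stock. The following sketch uses Simpson's rule, which is a choice made here for illustration; the helper names and the value assumed for K(0) are hypothetical. It reproduces the accumulation of 14 from Example 5 and the 1,000 from Example 4.

```python
def simpson(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] with Simpson's rule."""
    if n % 2:            # Simpson's rule needs an even number of subintervals
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def investment(t):
    return 3.0 * t ** 0.5          # I(t) of Examples 3 and 5

def capital(t, K0=0.0):
    return 2.0 * t ** 1.5 + K0     # K(t) from (14.10); K0 is an assumed initial stock

accumulation = simpson(investment, 1.0, 4.0)
print("area under I(t) on [1, 4]:", accumulation)
print("K(4) - K(1):              ", capital(4.0) - capital(1.0))
print("constant flow, one year:  ", simpson(lambda t: 1000.0, 0.0, 1.0))
```

Note that the assumed K(0) drops out of the difference K(4) − K(1), which is why the amount of accumulation over an interval does not depend on the initial stock.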


Present Value of a Cash Flow
Our earlier discussion of discounting and present value, limited to the case of a single
future value V, led us to the discounting formulas

$$A = V(1+i)^{-t} \qquad [\text{discrete case}]$$

and

$$A = Ve^{-rt} \qquad [\text{continuous case}]$$

Now suppose that we have a stream or flow of future values: a series of revenues receiv-
able at various times or of cost outlays payable at various times. How do we compute the
present value of the entire cash stream, or cash flow?

In the discrete case, if we assume three future revenue figures $R_t$ (t = 1, 2, 3) available
at the end of the tth year and also assume an interest rate of i per annum, the present values
of $R_t$ will be, respectively,

$$R_1(1+i)^{-1} \qquad R_2(1+i)^{-2} \qquad R_3(1+i)^{-3}$$

It follows that the total present value is the sum

$$\Pi = \sum_{t=1}^{3} R_t(1+i)^{-t} \tag{14.11}$$

(Π is the uppercase Greek letter pi, here signifying present.) This differs from the single-
value formula only in the replacement of V by $R_t$ and in the insertion of the Σ sign.
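The sum in (14.11) translates directly into a short computation. The sketch below is a minimal illustration; the function name and the revenue figures are hypothetical, chosen so that each discounted term comes out to 100 at a 10 percent interest rate.

```python
def present_value_discrete(revenues, i):
    """Sum of R_t * (1 + i)**(-t), with R_t received at the end of year t = 1, 2, ..."""
    return sum(R * (1.0 + i) ** (-t) for t, R in enumerate(revenues, start=1))

# Hypothetical three-year revenue stream (not from the text).
revenues = [110.0, 121.0, 133.1]
print(present_value_discrete(revenues, 0.10))
```

The `start=1` argument mirrors the lower summation index in (14.11): the first revenue is not received until the end of the first year.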

The idea of the sum readily carries over to the case of a continuous cash flow, but in the
latter context the Σ symbol must give way, of course, to the definite integral sign. Consider
a continuous revenue stream at the rate of R(t) dollars per year. This means that at $t = t_1$
the rate of flow is $R(t_1)$ dollars per year, but at another point of time $t = t_2$ the rate will
be $R(t_2)$ dollars per year, with t taken as a continuous variable. At any point of time,
the amount of revenue during the interval [t, t + dt] can be written as R(t) dt [cf. the
previous discussion of dK ≡ I(t) dt]. When continuously discounted at the rate of r per
year, its present value should be $R(t)e^{-rt}\,dt$. If we let our problem be that of finding the
total present value of a 3-year stream, our answer is to be found in the following definite
integral:

$$\Pi = \int_0^3 R(t)e^{-rt}\,dt \tag{14.11'}$$

This expression, the continuous version of the sum in (14.11), differs from the single-value
formula only in the replacement of V by R(t) and in the appending of the definite integral
sign.†

† It may be noted that, whereas the upper summation index and the upper limit of integration are
identical at 3, the lower summation index 1 differs from the lower limit of integration 0. This is
because the first revenue in the discrete stream, by assumption, will not be forthcoming until t = 1
(end of first year), but the revenue flow in the continuous case is assumed to commence immediately
after t = 0.


Example 6

What is the present value of a continuous revenue flow lasting for y years at the constant
rate of D dollars per year and discounted at the rate of r per year? According to (14.11'), we
have

$$\Pi = \int_0^y De^{-rt}\,dt = D\int_0^y e^{-rt}\,dt = D\left[\frac{-1}{r}e^{-rt}\right]_0^y = \frac{-D}{r}\,e^{-rt}\,\Big|_{t=0}^{t=y} = \frac{-D}{r}\,(e^{-ry} - 1) = \frac{D}{r}\,(1 - e^{-ry}) \tag{14.12}$$

Thus, Π depends on D, r, and y. If D = $3,000, r = 0.06, and y = 2, for instance, we have

$$\Pi = \frac{3{,}000}{0.06}\,(1 - e^{-0.12}) = 50{,}000(1 - 0.8869) = \$5{,}655 \qquad [\text{approximately}]$$

The value of Π naturally is always positive; this follows from the positivity of D and r, as well
as $(1 - e^{-ry})$. (The number e raised to any negative power will always give a positive frac-
tional value, as can be seen from the second quadrant of Fig. 10.3a.)

Example 7

In the wine-storage problem of Sec. 10.6, we assumed zero storage cost. That simplifying
assumption was necessitated by our ignorance of a way to compute the present value of a
cost flow. With this ignorance behind us, we are now ready to permit the wine dealer to
incur storage costs.

Let the purchase cost of the case of wine be an amount C, incurred at the present time.
Its (future) sale value, which varies with time, may be generally denoted as V(t), its present
value being $V(t)e^{-rt}$. Whereas the sale value represents a single future value (there can be
only one sale transaction on this case of wine), the storage cost is a stream. Assuming this
cost to be a constant stream at the rate of s dollars per year, the total present value of the
storage cost incurred in a total of t years will amount to

$$\int_0^t se^{-rt}\,dt = \frac{s}{r}\,(1 - e^{-rt}) \qquad [\text{by (14.12)}]$$

Thus the net present value, which is what the dealer would seek to maximize, can be expressed as

$$N(t) = V(t)e^{-rt} - \frac{s}{r}\,(1 - e^{-rt}) - C = \left[V(t) + \frac{s}{r}\right]e^{-rt} - \frac{s}{r} - C$$

which is an objective function in a single choice variable t.

To maximize N(t), the value of t must be chosen such that N'(t) = 0. This first derivative is

$$N'(t) = V'(t)e^{-rt} - r\left[V(t) + \frac{s}{r}\right]e^{-rt} \qquad [\text{product rule}]$$
$$\phantom{N'(t)} = [V'(t) - rV(t) - s]\,e^{-rt}$$

and it will be zero if and only if

$$V'(t) = rV(t) + s$$

Thus, this last equation may be taken as the necessary optimization condition for the choice
of the time of sale t*.

The economic interpretation of this condition appeals easily to intuitive reasoning: V'(t)
represents the rate of change of the sale value, or the increment in V, if sale is postponed for
a year, while the two terms on the right indicate, respectively, the increments in the interest
cost and the storage cost entailed by such a postponement of sale (revenue and cost are
both reckoned at time t*). So, the idea of the equating of the two sides is to us just some "old
wine in a new bottle," for it is nothing but the same MC = MR condition in a different guise!
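The closed form (14.12), which underlies both Example 6 and the storage-cost term of Example 7, can be checked against a direct numerical integration of the discounted flow. The sketch below is illustrative only: the function names and the use of the midpoint rule are assumptions made here. With unrounded arithmetic the formula gives roughly 5,654 for the data of Example 6; the figure of $5,655 in the text reflects the rounding of the exponential term to 0.8869.

```python
import math

def pv_constant_flow(D, r, y):
    """Present value (D / r) * (1 - e**(-r*y)) of a constant flow D lasting y years."""
    return (D / r) * (1.0 - math.exp(-r * y))

def pv_numeric(flow, r, y, n=100_000):
    """Midpoint-rule approximation of the integral of flow(t) * e**(-r*t) over [0, y]."""
    h = y / n
    return h * sum(flow((k + 0.5) * h) * math.exp(-r * (k + 0.5) * h) for k in range(n))

closed = pv_constant_flow(3000.0, 0.06, 2.0)
numeric = pv_numeric(lambda t: 3000.0, 0.06, 2.0)
print(f"closed form: {closed:.2f}   numerical: {numeric:.2f}")
```

The numerical routine accepts any flow function R(t), so the same sketch could be reused for a nonconstant revenue stream, for which a closed form may not be available.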


Present Value of a Perpetual Flow

If a cash flow were to persist forever (a situation exemplified by the interest from a per-
petual bond or the revenue from an indestructible capital asset such as land), the present
value of the flow would be

$$\Pi = \int_0^{\infty} R(t)e^{-rt}\,dt$$

which is an improper integral.


Find the present value of a perpetual income stream flowing at the uniform rate of D dol-
lars per year, if the continuous rate of discount is r. Since, in evaluating an improper inte-
gral, we simply take the limit of a proper integral, the result in (14.12) can still be of help.
Specifically, we can write

$$\Pi = \int_0^{\infty} De^{-rt}\,dt = \lim_{y \to \infty} \int_0^y De^{-rt}\,dt = \lim_{y \to \infty} \frac{D}{r}\,(1 - e^{-ry}) = \frac{D}{r}$$

Note that the y parameter (number of years) has disappeared from the final answer. This
is as it should be, for here we are dealing with a perpetual flow. You may also observe that
our result (present value = rate of revenue flow ÷ rate of discount) corresponds precisely to
the familiar formula for the so-called capitalization of an asset with a perpetual yield.
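The limiting process behind the capitalization formula can be made concrete by evaluating the finite-horizon present value for longer and longer horizons. In the sketch below, the figures D = 3,000 and r = 0.06 are borrowed from the earlier finite-stream example purely for illustration, and the function names are hypothetical.

```python
import math

def pv_finite(D, r, y):
    """Present value of a constant flow D over y years, from (14.12)."""
    return (D / r) * (1.0 - math.exp(-r * y))

def pv_perpetual(D, r):
    """Capitalization formula D / r for a perpetual flow."""
    return D / r

D, r = 3000.0, 0.06
for y in (10, 50, 100, 500):
    print(f"y = {y:>3}   finite-horizon PV = {pv_finite(D, r, y):12.4f}")
print(f"perpetual PV (D / r)     = {pv_perpetual(D, r):12.4f}")
```

The gap between the finite and the perpetual value is (D/r)e^(−ry), so it shrinks exponentially as the horizon y lengthens.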





EXERCISE 14.5 


1. Given the following marginal-revenue functions:
(a) $R'(Q) = 28Q - e^{0.3Q}$ (b) $R'(Q) = 10(1 + Q)^{-2}$
find in each case the total-revenue function R(Q). What initial condition can you
introduce to definitize the constant of integration?
2. (a) Given the marginal propensity to import M'(Y) = 0.1 and the information that
M = 20 when Y = 0, find the import function M(Y).
(b) Given the marginal propensity to consume $C'(Y) = 0.8 + 0.1Y^{-1/2}$ and the
information that C = Y when Y = 100, find the consumption function C(Y).
3. Assume that the rate of investment is described by the function $I(t) = 12t^{1/3}$ and that
K(0) = 25:
(a) Find the time path of capital stock K.
(b) Find the amount of capital accumulation during the time intervals [0, 1] and [1, 3],
respectively.
4. Given a continuous income stream at the constant rate of $1,000 per year:
(a) What will be the present value Π if the income stream lasts for 2 years and the
continuous discount rate is 0.05 per year?
(b) What will be the present value Π if the income stream terminates after exactly
3 years and the discount rate is 0.04?


5. What is the present value of a perpetual cash flow of:
(a) $1,450 per year, discounted at r = 5%?
(b) $2,460 per year, discounted at r = 3%?


14.6 Domar Growth Model


In the population-growth problem of (14.1) and (14.2) and the capital-formation problem
of (14.10), the common objective is to delineate a time path on the basis of some given
pattern of change of a variable. In the classic growth model of Professor Domar,* on the other
hand, the idea is to stipulate the type of time path required to prevail if a certain equilibrium
condition of the economy is to be satisfied.





The Framework 
The basic premises of the Domar model are as follows: 


1. Any change in the rate of investment flow per year I(t) will produce a dual effect: it will
affect the aggregate demand as well as the productive capacity of the economy.

2. The demand effect of a change in I(t) operates through the multiplier process, assumed
to work instantaneously. Thus an increase in I(t) will raise the rate of income flow per
year Y(t) by a multiple of the increment in I(t). The multiplier is k = 1/s, where s
stands for the given (constant) marginal propensity to save. On the assumption that I(t)
is the only (parametric) expenditure flow that influences the rate of income flow, we can
then state that

$$\frac{dY}{dt} = \frac{dI}{dt}\,\frac{1}{s} \tag{14.13}$$

3. The capacity effect of investment is to be measured by the change in the rate of poten-
tial output the economy is capable of producing. Assuming a constant capacity-capital
ratio, we can write

$$\frac{\kappa}{K} \equiv \rho \qquad (= \text{a constant})$$

where κ (the Greek letter kappa) stands for capacity or potential output flow per year,
and ρ (the Greek letter rho) denotes the given capacity-capital ratio. This implies, of
course, that with a capital stock K(t) the economy is potentially capable of producing
an annual product, or income, amounting to κ ≡ ρK dollars. Note that, from κ ≡ ρK
(the production function), it follows that dκ = ρ dK, and

$$\frac{d\kappa}{dt} = \rho\,\frac{dK}{dt} = \rho I \tag{14.14}$$

In Domar's model, equilibrium is defined to be a situation in which productive capacity
is fully utilized. To have equilibrium is, therefore, to require the aggregate demand to be
exactly equal to the potential output producible in a year; that is, Y = κ. If we start initially
from an equilibrium situation, however, the requirement will reduce to the balancing of the
respective changes in capacity and in aggregate demand; that is,

$$\frac{dY}{dt} = \frac{d\kappa}{dt} \tag{14.15}$$

* Evsey D. Domar, "Capital Expansion, Rate of Growth, and Employment," Econometrica, April 1946,
pp. 137-147; reprinted in Domar, Essays in the Theory of Economic Growth, Oxford University Press,
Fair Lawn, N.J., 1957, pp. 70-82.


What kind of time path of investment I(t) can satisfy this equilibrium condition at all
times?


Finding the Solution 
To answer this question, we first substitute (14.13) and (14.14) into the equilibrium condi-
tion (14.15). The result is the following differential equation:

$$\frac{dI}{dt}\,\frac{1}{s} = \rho I \qquad \text{or} \qquad \frac{1}{I}\,\frac{dI}{dt} = \rho s \tag{14.16}$$

Since (14.16) specifies a definite pattern of change for I, we should be able to find the equi-
librium (or required) investment path from it.

In this simple case, the solution is obtainable by directly integrating both sides of the
second equation in (14.16) with respect to t. The fact that the two sides are identical in equi-
librium assures the equality of their integrals. Thus,

$$\int \frac{1}{I}\,\frac{dI}{dt}\,dt = \int \rho s\,dt$$

By the substitution rule and the log rule, the left side gives us

$$\int \frac{dI}{I} = \ln|I| + c_1 \qquad (I \neq 0)$$

whereas the right side yields (ρs being a constant)

$$\int \rho s\,dt = \rho s t + c_2$$

Equating the two results and combining the two constants, we have

$$\ln|I| = \rho s t + c \tag{14.17}$$

To obtain |I| from ln |I|, we perform an operation known as "taking the antilog of ln |I|,"
which utilizes the fact that $e^{\ln x} = x$. Thus, letting each side of (14.17) become the exponent
of the constant e, we obtain

$$e^{\ln|I|} = e^{(\rho s t + c)}$$

or

$$|I| = e^{\rho s t}e^{c} = Ae^{\rho s t} \qquad \text{where } A \equiv e^{c}$$

If we take investment to be positive, then |I| = I, so that the preceding result becomes
$I(t) = Ae^{\rho s t}$, where A is arbitrary. To get rid of this arbitrary constant, we set t = 0 in the
equation $I(t) = Ae^{\rho s t}$, to get $I(0) = Ae^0 = A$. This definitizes the constant A, and enables
us to express the solution (the required investment path) as

$$I(t) = I(0)e^{\rho s t} \tag{14.18}$$

where I(0) denotes the initial rate of investment.
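The solution (14.18) can be cross-checked by integrating the differential equation dI/dt = ρsI numerically and comparing the outcome with the exponential path. The sketch below uses a classical fourth-order Runge-Kutta step, which is a choice made here for illustration; the parameter values ρ = 0.3, s = 0.2, and I(0) = 100 are hypothetical and not taken from the text.

```python
import math

def required_investment(I0, rho, s, t):
    """Closed-form path I(t) = I(0) * e**(rho*s*t) from (14.18)."""
    return I0 * math.exp(rho * s * t)

def integrate_rk4(I0, rho, s, t_end, steps=1000):
    """Integrate dI/dt = rho*s*I from t = 0 to t_end with fixed-step RK4."""
    g = rho * s              # required growth rate
    h = t_end / steps
    I = I0
    for _ in range(steps):
        k1 = g * I
        k2 = g * (I + 0.5 * h * k1)
        k3 = g * (I + 0.5 * h * k2)
        k4 = g * (I + h * k3)
        I += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return I

# Hypothetical parameter values, for illustration only.
I0, rho, s = 100.0, 0.3, 0.2
print("numerical I(10):  ", integrate_rk4(I0, rho, s, 10.0))
print("closed-form I(10):", required_investment(I0, rho, s, 10.0))
```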


This result has a disquieting economic meaning. In order to maintain the balance
between capacity and demand over time, the rate of investment flow must grow precisely
at the exponential rate of ρs, along a path such as illustrated in Fig. 14.6. Obviously, the
larger the capacity-capital ratio or the marginal propensity to save, the larger the required
rate of growth will be. But at any rate, once the values of ρ and s are known, the required
growth path of investment becomes very rigidly set.

‡ The solution (14.18) will remain valid even if we let investment be negative in the result $|I| = Ae^{\rho s t}$.
See Exercise 14.6-3.

FIGURE 14.6 [figure not reproduced: the curve $I(t) = I(0)e^{\rho s t}$ plotted against t, rising exponentially from the vertical intercept I(0)]


The Razor's Edge
It now becomes relevant to ask what will happen if the actual rate of growth of investment
(call that rate r) differs from the required rate ρs.

Domar's approach is to define a coefficient of utilization

$$u = \lim_{t \to \infty} \frac{Y(t)}{\kappa(t)} \qquad [u = 1 \text{ means full utilization of capacity}]$$

and show that u = r/ρs, so that u ⋛ 1 as r ⋛ ρs. In other words, if there is a discrepancy
between the actual and required rates (r ≠ ρs), we will find in the end (as t → ∞) either
a shortage of capacity (u > 1) or a surplus of capacity (u < 1), depending on whether r is
greater or less than ρs.

We can show, however, that the conclusion about capacity shortage and surplus really
applies at any time t, not only as t → ∞. For a given growth rate r implies that

$$I(t) = I(0)e^{rt} \qquad \text{and} \qquad \frac{dI}{dt} = rI(0)e^{rt}$$

Therefore, by (14.13) and (14.14), we have

$$\frac{dY}{dt} = \frac{1}{s}\,\frac{dI}{dt} = \frac{1}{s}\,rI(0)e^{rt}$$

$$\frac{d\kappa}{dt} = \rho I(t) = \rho I(0)e^{rt}$$

The ratio between these two derivatives,

$$\frac{dY/dt}{d\kappa/dt} = \frac{r}{\rho s}$$

should tell us the relative magnitudes of the demand-creating effect and the capacity-
generating effect of investment at any time t, under the actual growth rate of r. If r (the
actual rate) exceeds ρs (the required rate), then dY/dt > dκ/dt, and the demand effect
will outstrip the capacity effect, causing a shortage of capacity. Conversely, if r < ρs, there
will be a deficiency in aggregate demand and, hence, a surplus of capacity.
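The comparison between the actual and the required growth rate can be summarized in a few lines of code. The sketch below is illustrative: the function names, the tolerance used to detect equality, and the parameter values ρ = 0.3 and s = 0.2 (so that ρs = 0.06) are all assumptions introduced here. It computes the ratio r/ρs and reports the resulting capacity situation.

```python
def demand_capacity_ratio(r, rho, s):
    """Ratio (dY/dt) / (d kappa/dt) = r / (rho * s), which is independent of t."""
    return r / (rho * s)

def capacity_situation(r, rho, s, tol=1e-9):
    """Classify the outcome of an actual growth rate r against the required rate rho*s."""
    ratio = demand_capacity_ratio(r, rho, s)
    if abs(ratio - 1.0) <= tol:
        return "full utilization"
    return "capacity shortage" if ratio > 1.0 else "capacity surplus"

# Hypothetical parameters: required rate rho * s = 0.06.
rho, s = 0.3, 0.2
for r in (0.04, 0.06, 0.08):
    print(f"r = {r:.2f}   ratio = {demand_capacity_ratio(r, rho, s):.3f}   {capacity_situation(r, rho, s)}")
```

Because the factor I(0)e^(rt) cancels from the ratio, the classification does not depend on the date t or on the initial investment level, which is the point made in the preceding paragraph.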


The curious thing about this conclusion is that if investment actually grows at a faster
rate than required (r > ρs), the end result will be a shortage rather than a surplus of
capacity. It is equally curious that if the actual growth of investment lags behind the
required rate (r < ρs), we will encounter a capacity surplus rather than a shortage. Indeed,
because of such paradoxical results, if we now allow the entrepreneurs to adjust the actual
growth rate r (hitherto taken to be a constant) according to the prevailing capacity situation,
they will most certainly make the "wrong" kind of adjustment. In the case of r > ρs, for
instance, the emergent capacity shortage will motivate an even faster rate of investment.
But this would mean an increase in r, instead of the reduction called for under the circum-
stances. Consequently, the discrepancy between the two rates of growth would be intensi-
fied rather than reduced.

The upshot is that, given the parametric constants ρ and s, the only way to avoid both
shortage and surplus of productive capacity is to guide the investment flow ever so care-
fully along the equilibrium path with a growth rate r* = ρs. And, as we have shown,
any deviation from such a "razor's edge" time path will bring about a persistent failure to
satisfy the norm of full utilization which Domar envisaged in this model. This is perhaps
not too cheerful a prospect to contemplate. Fortunately, more flexible results become pos-
sible when certain assumptions of the Domar model are modified, as we shall see from the
growth model of Professor Solow, to be discussed in Chap. 15.








EXERCISE 14.6 


1. How many factors of production are explicitly considered in the Domar model? What
does this fact imply with regard to the capital-labor ratio in production?

2. We learned in Sec. 10.2 that the constant r in the exponential function $Ae^{rt}$ represents
the rate of growth of the function. Apply this to (14.16), and deduce (14.18) without
going through integration.

3. Show that even if we let investment be negative in the equation $|I| = Ae^{\rho s t}$, upon
definitizing the arbitrary constant A we will still end up with the solution (14.18).

4. Show that the result in (14.18) can be obtained alternatively by finding, and
equating, the definite integrals of both sides of (14.16),

$$\frac{1}{I}\,\frac{dI}{dt} = \rho s$$

with respect to the variable t, with limits of integration t = 0 and t = t. Remember that
when we change the variable of integration from t to I, the limits of integration will
change from t = 0 and t = t, respectively, to I = I(0) and I = I(t).














Chapter 15



































Continuous Time: 
First-Order Differential 
Equations 


In the Domar growth model, we have solved a simple differential equation by direct inte-
gration. For more complicated differential equations, there are various established methods
of solution. Even in the latter cases, however, the fundamental idea underlying the methods
of solution is still the techniques of integral calculus. For this reason, the solution to a
differential equation is often referred to as the integral of that equation.

Only first-order differential equations will be discussed in the present chapter. In this 
context, the word order refers to the highest order of the derivatives (or differentials) 
appearing in the differential equation; thus a first-order differential equation can contain 
only the first derivative, say, dy/dt.


15.1 First-Order Linear Differential Equations with Constant 
Coefficient and Constant Term 





The first derivative dy/dt is the only one that can appear in a first-order differential equa-
tion, but it may enter in various powers: dy/dt, (dy/dt)^2, or (dy/dt)^3. The highest power
attained by the derivative in the equation is referred to as the degree of the differential
equation. In case the derivative dy/dt appears only in the first degree, and so does the
dependent variable y, and furthermore, no product of the form y(dy/dt) occurs, then the
equation is said to be linear. Thus a first-order linear differential equation will generally
take the form¹

$$\frac{dy}{dt} + u(t)\,y = w(t) \tag{15.1}$$


¹ Note that the derivative term dy/dt in (15.1) has a unit coefficient. This is not to imply that it can
never actually have a coefficient other than one, but when such a coefficient appears, we can always
“normalize” the equation by dividing each term by the said coefficient. For this reason, the form
given in (15.1) may nonetheless be regarded as a general representation.




where u and w are two functions of t, as is y. In contrast to dy/dt and y, however, no
restriction whatsoever is placed on the independent variable t. Thus the functions u and w
may very well represent such expressions as t^2 and e^t or some more complicated functions
of t; on the other hand, u and w may also be constants.

This last point leads us to a further classification. When the function u (the coefficient of 
the dependent variable y) is a constant, and when the function w is a constant additive term, 
(15.1) reduces to the special case of a first-order linear differential equation with constant 
coefficient and constant term. In this section, we shall deal only with this simple variety of 
differential equations. 


The Homogeneous Case 

If u and w are constant functions and if w happens to be identically zero, (15.1) will become

$$\frac{dy}{dt} + ay = 0 \tag{15.2}$$

where a is some constant. This differential equation is said to be homogeneous on account
of the zero constant term (compare with homogeneous-equation systems). The defining
characteristic of a homogeneous equation is that when all the variables (here, dy/dt and y)
are multiplied by a given constant, the equation remains valid. This characteristic holds if
the constant term is zero, but will be lost if the constant term is not zero.

Equation (15.2) can be written alternatively as

$$\frac{1}{y}\frac{dy}{dt} = -a \tag{15.2'}$$

But you will recognize that the differential equation (14.16) we met in the Domar model is
precisely of this form. Therefore, by analogy, we should be able to write the solution of
(15.2) or (15.2') immediately as follows:

$$y(t) = Ae^{-at} \qquad \text{[general solution]} \tag{15.3}$$

or

$$y(t) = y(0)e^{-at} \qquad \text{[definite solution]} \tag{15.3'}$$


In (15.3), there appears an arbitrary constant A; therefore it is a general solution. When any
particular value is substituted for A, the solution becomes a particular solution of (15.2).
There is an infinite number of particular solutions, one for each possible value of A, in-
cluding the value y(0). This latter value, however, has a special significance: y(0) is the
only value that can make the solution satisfy the initial condition. Since this represents the
result of definitizing the arbitrary constant, we shall refer to (15.3') as the definite solution
of the differential equation (15.2) or (15.2').

You should observe two things about the solution of a differential equation: (1) the solu-
tion is not a numerical value, but rather a function y(t), a time path if t symbolizes time; and
(2) the solution y(t) is free of any derivative or differential expressions, so that as soon as a
specific value of t is substituted into it, a corresponding value of y can be calculated directly.
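
To make the point concrete that the solution is a function rather than a number, the definite solution y(t) = y(0)e^{-at} can be sketched in Python. The helper name `homogeneous_path` and the values a = 0.5 and y(0) = 4 are assumptions chosen purely for illustration.

```python
import math

def homogeneous_path(y0, a):
    """Return the time path y(t) = y(0) * exp(-a*t) for dy/dt + a*y = 0."""
    return lambda t: y0 * math.exp(-a * t)

# Illustrative (assumed) values: a = 0.5, y(0) = 4.
y = homogeneous_path(y0=4.0, a=0.5)

# The solution is a function of t, so any value of t can be substituted directly.
for t in (0.0, 1.0, 2.0):
    print(t, y(t))
```

Each call y(t) returns a value of y without any further differentiation or integration, which is exactly property (2) above.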


The Nonhomogeneous Case 
When a nonzero constant takes the place of the zero in (15.2), we have a nonhomogeneous
linear differential equation

$$\frac{dy}{dt} + ay = b \tag{15.4}$$




The solution of this equation will consist of the sum of two terms, one of which is called
the complementary function (which we shall denote by y_c), and the other known as the
particular integral (to be denoted by y_p). As will be shown, each of these has a significant
economic interpretation. Here, we shall present only the method of solution; its rationale
will become clear later.

Even though our objective is to solve the nonhomogeneous equation (15.4), frequently
we shall have to refer to its homogeneous version, as shown in (15.2). For convenient ref-
erence, we call the latter the reduced equation of (15.4). The nonhomogeneous equation
(15.4) itself can accordingly be referred to as the complete equation. It turns out that the
complementary function y_c is nothing but the general solution of the reduced equation,
whereas the particular integral y_p is simply any particular solution of the complete
equation.

Our discussion of the homogeneous case has already given us the general solution of the
reduced equation, and we may therefore write

$$y_c = Ae^{-at} \qquad \text{[by (15.3)]}$$





What about the particular integral? Since the particular integral is any particular solution
of the complete equation, we can first try the simplest possible type of solution, namely, y
being some constant (y = k). If y is a constant, then it follows that dy/dt = 0, and (15.4)
will become ay = b, with the solution y = b/a. Therefore, the constant solution will work
as long as a ≠ 0. In that case, we have

$$y_p = \frac{b}{a} \qquad (a \neq 0)$$


The sum of the complementary function and the particular integral then constitutes the
general solution of the complete equation (15.4):

$$y(t) = y_c + y_p = Ae^{-at} + \frac{b}{a} \qquad \text{[general solution, case of } a \neq 0\text{]} \tag{15.5}$$


What makes this a general solution is the presence of the arbitrary constant A. We may,
of course, definitize this constant by means of an initial condition. Let us say that y takes
the value y(0) when t = 0. Then, by setting t = 0 in (15.5), we find that

$$y(0) = A + \frac{b}{a} \qquad \text{and} \qquad A = y(0) - \frac{b}{a}$$

Thus we can rewrite (15.5) into

$$y(t) = \left[y(0) - \frac{b}{a}\right]e^{-at} + \frac{b}{a} \qquad \text{[definite solution, case of } a \neq 0\text{]} \tag{15.5'}$$


It should be noted that the use of the initial condition to definitize the arbitrary constant
is, and should be, undertaken as the final step, after we have found the general solution
to the complete equation. Since the values of both y_c and y_p are related to the value of y(0),
both of these must be taken into account in definitizing the constant A.


Example 1

Solve the equation dy/dt + 2y = 6, with the initial condition y(0) = 10. Here, we have
a = 2 and b = 6; thus, by (15.5'), the solution is

$$y(t) = (10 - 3)e^{-2t} + 3 = 7e^{-2t} + 3$$
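
The definite-solution formula for the case a ≠ 0 lends itself to a short computational sketch. The function name `definite_solution` is our own (hypothetical) label; the code simply forms the particular integral b/a, definitizes A as y(0) - b/a, and reproduces Example 1.

```python
import math

def definite_solution(y0, a, b):
    """Definite solution of dy/dt + a*y = b for a != 0:
    y(t) = [y(0) - b/a] * exp(-a*t) + b/a."""
    if a == 0:
        raise ValueError("this formula requires a != 0")
    y_p = b / a          # particular integral
    A = y0 - y_p         # definitized arbitrary constant
    return lambda t: A * math.exp(-a * t) + y_p

# Example 1: dy/dt + 2y = 6, y(0) = 10, so y(t) = 7e^{-2t} + 3.
y = definite_solution(y0=10.0, a=2.0, b=6.0)
print(y(0.0))                               # 10.0
print(y(1.0), 7 * math.exp(-2.0) + 3)       # the two numbers agree
```

Note that A is computed only after y_p is known, in keeping with the remark that definitizing the constant is the final step.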


Example 2

Solve the equation dy/dt + 4y = 0, with the initial condition y(0) = 1. Since a = 4 and
b = 0, we have

$$y(t) = (1 - 0)e^{-4t} + 0 = e^{-4t}$$

The same answer could have been obtained from (15.3'), the formula for the homogeneous
case. The homogeneous equation (15.2) is merely a special case of the nonhomogeneous
equation (15.4) when b = 0. Consequently, the formula (15.3') is also a special case of for-
mula (15.5') under the circumstance that b = 0.


What if a = 0, so that the solution in (15.5') is undefined? In that case, the differential
equation is of the extremely simple form

$$\frac{dy}{dt} = b \tag{15.6}$$


By straight integration, its general solution can be readily found to be

$$y(t) = bt + c \tag{15.7}$$

where c is an arbitrary constant. The two component terms in (15.7) can, in fact, again be
identified as the complementary function and the particular integral of the given differen-
tial equation, respectively. Since a = 0, the complementary function can be expressed
simply as

$$y_c = Ae^{-at} = Ae^{0} = A \qquad (A = \text{an arbitrary constant})$$

As to the particular integral, the fact that the constant solution y = k fails to work in the
present case of a = 0 suggests that we should try instead a nonconstant solution. Let us
consider the simplest possible type of the latter, namely, y = kt. If y = kt, then dy/dt = k,
and the complete equation (15.6) will reduce to k = b, so that we may write

$$y_p = bt \qquad (a = 0)$$

Our new trial solution indeed works! The general solution of (15.6) is therefore

$$y(t) = y_c + y_p = A + bt \qquad \text{[general solution, case of } a = 0\text{]} \tag{15.7'}$$

which is identical with the result in (15.7), because c and A are but alternative notations for
an arbitrary constant. Note, however, that in the present case, y_c is a constant whereas y_p is
a function of time, the exact opposite of the situation in (15.5).

By definitizing the arbitrary constant, we find the definite solution to be

$$y(t) = y(0) + bt \qquad \text{[definite solution, case of } a = 0\text{]} \tag{15.7''}$$

Example 3

Solve the equation dy/dt = 2, with the initial condition y(0) = 5. The solution is, by
(15.7''),

$$y(t) = 5 + 2t$$
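
For the case a = 0 the definite solution is a straight line in t, which is easy to sketch in code. The helper name `definite_solution_a0` is hypothetical; the numbers are those of the example with dy/dt = 2 and y(0) = 5.

```python
def definite_solution_a0(y0, b):
    """Definite solution of dy/dt = b (the case a = 0): y(t) = y(0) + b*t."""
    return lambda t: y0 + b * t

# dy/dt = 2, y(0) = 5, so y(t) = 5 + 2t.
y = definite_solution_a0(y0=5.0, b=2.0)

# The slope between any two points in time should equal the constant b = 2.
slope = (y(3.0) - y(1.0)) / 2.0
print(y(0.0), slope)   # 5.0 2.0
```

Here the time path never settles down to a constant level, since the particular integral bt itself grows with t.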


Verification of the Solution 
It is true of all solutions of differential equations that their validity can always be checked 
by differentiation. 




If we try that on the solution (15.5'), we can obtain the derivative

$$\frac{dy}{dt} = -a\left[y(0) - \frac{b}{a}\right]e^{-at}$$

When this expression for dy/dt and the expression for y(t) as shown in (15.5') are substi-
tuted into the left side of the differential equation (15.4), that side should reduce exactly
to the value of the constant term b on the right side of (15.4) if the solution is correct.
Performing this substitution, we indeed find that

$$-a\left[y(0) - \frac{b}{a}\right]e^{-at} + a\left\{\left[y(0) - \frac{b}{a}\right]e^{-at} + \frac{b}{a}\right\} = b$$

Thus our solution is correct, provided it also satisfies the initial condition. To check the
latter, let us set t = 0 in the solution (15.5'). Since the result

$$y(0) = \left[y(0) - \frac{b}{a}\right] + \frac{b}{a} = y(0)$$

is an identity, the initial condition is indeed satisfied.

It is recommended that, as a final step in the process of solving a differential equation,
you make it a habit to check the validity of your answer by making sure (1) that the deriv-
ative of the time path y(t) is consistent with the given differential equation and (2) that the
definite solution satisfies the initial condition.
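
The two-part check recommended above can also be carried out numerically. The sketch below is only a spot check at a few sample points, not a proof: it approximates dy/dt by a central difference (an assumption of this sketch, with step h and tolerance tol chosen arbitrarily) and applies both tests to a candidate answer for dy/dt + 2y = 6, y(0) = 10.

```python
import math

def check_solution(y, a, b, y0, ts=(0.0, 0.5, 1.0, 2.0), h=1e-5, tol=1e-6):
    """Check (1) dy/dt + a*y = b at the sample points ts and (2) y(0) = y0."""
    for t in ts:
        dydt = (y(t + h) - y(t - h)) / (2 * h)   # central-difference derivative
        if abs(dydt + a * y(t) - b) > tol:
            return False
    return abs(y(0.0) - y0) <= tol

# Candidate answer: y(t) = 7e^{-2t} + 3.
good = lambda t: 7 * math.exp(-2 * t) + 3
# A deliberately wrong candidate (sign error in the exponent).
bad = lambda t: 7 * math.exp(2 * t) + 3

print(check_solution(good, a=2, b=6, y0=10))   # True
print(check_solution(bad, a=2, b=6, y0=10))    # False
```

The wrong candidate still satisfies the initial condition, which shows why both parts of the check are needed.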





EXERCISE 15.1 
1. Find y_c, y_p, the general solution, and the definite solution, given:
(a) dy/dt + 4y = 12; y(0) = 2
(b) dy/dt - 2y = 0; y(0) = 9
(c) dy/dt + 10y = 15; y(0) = 0
(d) 2 dy/dt + 4y = 6; y(0) = 1½

2. Check the validity of your answers to Prob. 1.

3. Find the solution of each of the following by using an appropriate formula developed
in the text:
(a) dy/dt + y = 4; y(0) = 0
(b) dy/dt = 23; y(0) = 1
(c) dy/dt - 5y = 0; y(0) = 6
(d) dy/dt + 3y = 2; y(0) = 4
(e) dy/dt - 7y = 7; y(0) = 7
(f) 3 dy/dt + 6y = 5; y(0) = 0

4. Check the validity of your answers to Prob. 3.


15.2 Dynamics of Market Price 








In the (macro) Domar growth model, we found an application of the homogeneous case of
linear differential equations of the first order. To illustrate the nonhomogeneous case, let us
present a (micro) dynamic model of the market.




The Framework 
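Suppose that, for a particular commodity, the demand and supply functions are as follows:

$$Q_d = \alpha - \beta P \qquad (\alpha, \beta > 0)$$

$$Q_s = -\gamma + \delta P \qquad (\gamma, \delta > 0) \tag{15.8}$$

Then, according to (3.4), the equilibrium price should be†

$$P^* = \frac{\alpha + \gamma}{\beta + \delta} \qquad (= \text{some positive constant}) \tag{15.9}$$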


If it happens that the initial price P(0) is precisely at the level of P*, the market will clearly
be in equilibrium already, and no dynamic analysis will be needed. In the more interesting
case of P(0) ≠ P*, however, P* is attainable (if ever) only after a due process of adjust-
ment, during which not only will price change over time but Q_d and Q_s, being functions of
P, must change over time as well. In this light, then, the price and quantity variables can all
be taken to be functions of time.

Our dynamic question is this: Given sufficient time for the adjustment process to work
itself out, does it tend to bring price to the equilibrium level P*? That is, does the time path
P(t) tend to converge to P*, as t → ∞?


The Time Path 
To answer this question, we must first find the time path P(t). But that, in turn, requires a
specific pattern of price change to be prescribed first. In general, price changes are gov-
erned by the relative strength of the demand and supply forces in the market. Let us assume,
for the sake of simplicity, that the rate of price change (with respect to time) at any moment
is always directly proportional to the excess demand (Q_d - Q_s) prevailing at that moment.
Such a pattern of change can be expressed symbolically as

$$\frac{dP}{dt} = j(Q_d - Q_s) \qquad (j > 0) \tag{15.10}$$

where j represents a (constant) adjustment coefficient. With this pattern of change, we can
have dP/dt = 0 if and only if Q_d = Q_s. In this connection, it may be instructive to note
two senses of the term equilibrium price: the intertemporal sense (P being constant over
time) and the market-clearing sense (the equilibrium price being one that equates Q_d and
Q_s). In the present model, the two senses happen to coincide with each other, but this may
not be true of all models.

By virtue of the demand and supply functions in (15.8), we can express (15.10) specifi-
cally in the form

$$\frac{dP}{dt} = j(\alpha - \beta P + \gamma - \delta P) = j(\alpha + \gamma) - j(\beta + \delta)P$$

or

$$\frac{dP}{dt} + j(\beta + \delta)P = j(\alpha + \gamma) \tag{15.10'}$$


† We have switched from the symbols (a, b, c, d) of (3.4) to (α, β, γ, δ) here to avoid any possible
confusion with the use of a and b as parameters in the differential equation (15.4) which we shall
presently apply to the market model.




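Since this is precisely in the form of the differential equation (15.4), and since the coeffi-
cient of P is nonzero, we can apply the solution formula (15.5') and write the solution (the
time path of price) as

$$\begin{aligned}
P(t) &= \left[P(0) - \frac{\alpha + \gamma}{\beta + \delta}\right]e^{-j(\beta + \delta)t} + \frac{\alpha + \gamma}{\beta + \delta} \\
&= [P(0) - P^*]e^{-kt} + P^* \qquad \text{[by (15.9); } k \equiv j(\beta + \delta)\text{]}
\end{aligned} \tag{15.11}$$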
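
The time path (15.11) can be traced out numerically once parameter values are supplied. In the sketch below the function name `price_path` and all parameter values (α = 10, β = 2, γ = 2, δ = 1, j = 0.5, P(0) = 6) are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
import math

def price_path(P0, alpha, beta, gamma, delta, j):
    """Time path (15.11): P(t) = [P(0) - P*] e^{-kt} + P*, with k = j(beta + delta)."""
    P_star = (alpha + gamma) / (beta + delta)   # equilibrium price (15.9)
    k = j * (beta + delta)
    return (lambda t: (P0 - P_star) * math.exp(-k * t) + P_star), P_star, k

# Hypothetical parameter values, for illustration only.
P, P_star, k = price_path(P0=6.0, alpha=10.0, beta=2.0, gamma=2.0, delta=1.0, j=0.5)

print(P_star, k)                 # P* = 12/3 = 4.0, k = 0.5 * 3 = 1.5
for t in (0, 1, 2, 5):
    print(t, round(P(t), 4))     # starts at 6.0 and falls toward 4.0
```

With these values the price starts above P* and declines toward it as t increases.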


The Dynamic Stability of Equilibrium 

In the end, the question originally posed, namely, whether P(t) → P* as t → ∞, amounts
to the question of whether the first term on the right of (15.11) will tend to zero as t → ∞.
Since P(0) and P* are both constant, the key factor will be the exponential expression
e^{-kt}. In view of the fact that k > 0, that expression does tend to zero as t → ∞. Conse-
quently, on the assumptions of our model, the time path will indeed lead the price toward
the equilibrium position. In a situation of this sort, where the time path of the relevant vari-
able P(t) converges to the level P* (interpreted here in its role as the intertemporal, rather
than market-clearing, equilibrium), the equilibrium is said to be dynamically stable.

The concept of dynamic stability is an important one. Let us examine it further by a
more detailed analysis of (15.11). Depending on the relative magnitudes of P(0) and P*,
the solution (15.11) really encompasses three possible cases. The first is P(0) = P*, which
implies P(t) = P*. In that event, the time path of price can be drawn as the horizontal
straight line in Fig. 15.1. As mentioned earlier, the attainment of equilibrium is in this case
a fait accompli. Second, we may have P(0) > P*. In this case, the first term on the right of
(15.11) is positive, but it will decrease as the increase in t lowers the value of e^{-kt}. Thus the
time path will approach the equilibrium level P* from above, as illustrated by the top curve
in Fig. 15.1. Third, in the opposite case of P(0) < P*, the equilibrium level P* will be
approached from below, as illustrated by the bottom curve in the same figure. In general,
to have dynamic stability, the deviation of the time path from equilibrium must either be
identically zero (as in case 1) or steadily decrease with time (as in cases 2 and 3).

A comparison of (15.11) with (15.5') tells us that the P* term, the counterpart of b/a,
is nothing but the particular integral y_p, whereas the exponential term is the (definitized)
complementary function y_c. Thus, we now have an economic interpretation for y_c and
y_p: y_p represents the intertemporal equilibrium level of the relevant variable, and y_c is the
deviation from equilibrium. Dynamic stability requires the asymptotic vanishing of the
complementary function as t becomes infinite.
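
The three cases can be illustrated by tabulating the deviation from equilibrium, that is, the complementary function [P(0) - P*]e^{-kt}. The values P* = 4 and k = 1.5 and the three initial prices are assumptions made for this sketch only.

```python
import math

def deviation(P0, P_star, k, t):
    """Complementary function [P(0) - P*] e^{-kt}: the deviation from equilibrium."""
    return (P0 - P_star) * math.exp(-k * t)

P_star, k = 4.0, 1.5          # hypothetical values
times = [0, 1, 2, 3, 4]

for P0 in (6.0, 4.0, 2.0):    # P(0) > P*, P(0) = P*, P(0) < P*
    devs = [deviation(P0, P_star, k, t) for t in times]
    print(P0, [round(d, 4) for d in devs])
```

In the first case the deviations are positive and shrinking, in the second they are identically zero, and in the third they are negative and shrinking in absolute value.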











[FIGURE 15.1: time paths P(t) plotted against t, one for each of the cases P(0) > P*, P(0) = P*, and P(0) < P*]







In this model, the particular integral is a constant, so we have a stationary equilibrium
in the intertemporal sense, represented by P*. If the particular integral is nonconstant, as in
(15.7''), on the other hand, we may interpret it as a moving equilibrium.


An Alternative Use of the Model 
What we have done in the preceding is to analyze the dynamic stability of equilibrium (the
convergence of the time path), given certain sign specifications for the parameters. An al-
ternative type of inquiry is: In order to ensure dynamic stability, what specific restrictions
must be imposed upon the parameters?

The answer to that is contained in the solution (15.11). If we allow P(0) ≠ P*, we see
that the first (y_c) term in (15.11) will tend to zero as t → ∞ if and only if k > 0, that is,
if and only if

$$j(\beta + \delta) > 0$$

Thus, we can take this last inequality as the required restriction on the parameters j (the ad-
justment coefficient of price), β (the negative of the slope of the demand curve, plotted with
Q on the vertical axis), and δ (the slope of the supply curve, plotted similarly).

In case the price adjustment is of the “normal” type, with j > 0, so that excess demand
drives price up rather than down, then this restriction becomes merely (β + δ) > 0 or,
equivalently,

$$\delta > -\beta$$
To have dynamic stability in that event, the slope of the supply must exceed the slope of the
demand. When both demand and supply are normally sloped (-β < 0, δ > 0), as in
(15.8), this requirement is obviously met. But even if one of the curves is sloped
“perversely,” the condition may still be fulfilled, such as when δ = 1 and -β = 1/2 (posi-
tively sloped demand). The latter situation is illustrated in Fig. 15.2, where the equilibrium
price P* is, as usual, determined by the point of intersection of the two curves. If the initial
price happens to be at P_1, then Q_d (distance P_1 G) will exceed Q_s (distance P_1 F), and the
excess demand (FG) will drive price up. On the other hand, if price is initially at P_2, then
there will be a negative excess demand MN, which will drive the price down. As the two ar-
rows in the figure show, therefore, the price adjustment in this case will be toward the equi-
librium, no matter which side of P* we start from. We should emphasize, however, that
while these arrows can display the direction, they are incapable of indicating the magnitude
of change. Thus Fig. 15.2 is basically static, not dynamic, in nature, and can serve only to
illustrate, not to replace, the dynamic analysis presented.

[FIGURE 15.2: demand and supply curves plotted with Q on the vertical axis, intersecting at the equilibrium price P*]
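
The parameter restriction j(β + δ) > 0 is simple enough to test directly. The function name `is_dynamically_stable` is hypothetical; the second call uses the “perverse” demand slope mentioned in the text (δ = 1, -β = 1/2), and the third adds an assumed case in which the positively sloped demand is steeper than the supply.

```python
def is_dynamically_stable(j, beta, delta):
    """Stability restriction read off (15.11): k = j(beta + delta) > 0."""
    return j * (beta + delta) > 0

# Normal slopes: demand slope -beta = -2, supply slope delta = 1.
print(is_dynamically_stable(j=1.0, beta=2.0, delta=1.0))    # True
# Positively sloped demand with -beta = 1/2 and delta = 1, so delta > -beta still holds.
print(is_dynamically_stable(j=1.0, beta=-0.5, delta=1.0))   # True
# Positively sloped demand steeper than supply: -beta = 2 > delta = 1.
print(is_dynamically_stable(j=1.0, beta=-2.0, delta=1.0))   # False
```

Like the arrows in the figure, this check reports only whether the price converges, not how fast; the speed is governed by the size of k.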








EXERCISE 15.2 


1. If both the demand and supply in Fig. 15.2 are negatively sloped instead, which curve
should be steeper in order to have dynamic stability? Does your answer conform to the
criterion δ > -β?

2. Show that (15.10') can be rewritten as dP/dt + k(P - P*) = 0. If we let P - P* ≡ Δ
(signifying deviation), so that dΔ/dt = dP/dt, the differential equation can be further
rewritten as

$$\frac{d\Delta}{dt} + k\Delta = 0$$

Find the time path Δ(t), and discuss the condition for dynamic stability.

3. The dynamic market model discussed in this section is closely patterned after the static
one in Sec. 3.2. What specific new feature is responsible for transforming the static
model into a dynamic one?

4. Let the demand and supply be

$$Q_d = \alpha - \beta P + \sigma\frac{dP}{dt} \qquad Q_s = -\gamma + \delta P \qquad (\alpha, \beta, \gamma, \delta > 0)$$

(a) Assuming that the rate of change of price over time is directly proportional to the
excess demand, find the time path P(t) (general solution).

(b) What is the intertemporal equilibrium price? What is the market-clearing equilib-
rium price?

(c) What restriction on the parameter σ would ensure dynamic stability?

5. Let the demand and supply be

$$Q_d = \alpha - \beta P - \eta\frac{dP}{dt} \qquad Q_s = \delta P \qquad (\alpha, \beta, \eta, \delta > 0)$$

(a) Assuming that the market is cleared at every point of time, find the time path P(t)
(general solution).

(b) Does this market have a dynamically stable intertemporal equilibrium price?

(c) The assumption of the present model that Q_d = Q_s for all t is identical with that of
the static market model in Sec. 3.2. Nevertheless, we still have a dynamic model
here. How come?


15.3 Variable Coefficient and Variable Term 





In the more general case of a first-order linear differential equation

$$\frac{dy}{dt} + u(t)\,y = w(t) \tag{15.12}$$

u(t) and w(t) represent a variable coefficient and a variable term, respectively. How do we
find the time path y(t) in this case?


The Homogeneous Case 
For the homogeneous case, where w(t) = 0, the solution is still easy to obtain. Since the
differential equation is in the form

$$\frac{dy}{dt} + u(t)\,y = 0 \qquad \text{or} \qquad \frac{1}{y}\frac{dy}{dt} = -u(t) \tag{15.13}$$

we have, by integrating both sides in turn with respect to t,

$$\text{Left side} = \int \frac{1}{y}\frac{dy}{dt}\,dt = \int \frac{dy}{y} = \ln y + c \qquad (\text{assuming } y > 0)$$

$$\text{Right side} = \int -u(t)\,dt = -\int u(t)\,dt$$

In the latter, the integration process cannot be carried further because u(t) has not been
given a specific form; thus we have to settle for just a general integral expression. When the
two sides are equated, the result is

$$\ln y = -c - \int u(t)\,dt$$

Then the desired y path can be obtained by taking the antilog of ln y:

$$y(t) = e^{\ln y} = e^{-c}e^{-\int u(t)\,dt} = Ae^{-\int u(t)\,dt} \qquad \text{where } A \equiv e^{-c} \tag{15.14}$$





This is the general solution of the differential equation (15.13). 

To highlight the variable nature of the coefficient u(t), we have so far explicitly written
out the argument t. For notational simplicity, however, we shall from here on omit the
argument and shorten u(t) to u.

As compared with the general solution (15.3) for the constant-coefficient case, the only
modification in (15.14) is the replacement of the e^{-at} expression by the more complicated
expression e^{-∫u dt}. The rationale behind this change can be better understood if we inter-
pret the at term in e^{-at} as an integral: ∫a dt = at (plus a constant which can be absorbed
into the A term, since e raised to a constant power is again a constant). In this light, the dif-
ference between the two general solutions in fact turns into a similarity. For in both cases
we are taking the coefficient of the y term in the differential equation (a constant term a in
one case, and a variable term u in the other) and integrating that with respect to t, and then
taking the negative of the resulting integral as the exponent of e.

Once the general solution is obtained, it is a relatively simple matter to get the definite
solution with the help of an appropriate initial condition.

Example 1

Find the general solution of the equation dy/dt + 3t^2 y = 0. Here we have u = 3t^2, and
∫u dt = ∫3t^2 dt = t^3 + c. Therefore, by (15.14), we may write the solution as

$$y(t) = Ae^{-(t^3 + c)} = Ae^{-t^3}e^{-c} = Be^{-t^3} \qquad \text{where } B \equiv Ae^{-c}$$

Observe that if we had omitted the constant of integration c, we would have lost no
information, because then we would have obtained y(t) = Ae^{-t^3}, which is really the identi-
cal solution since A and B both represent arbitrary constants. In other words, the expression
e^{-c}, where the constant c makes its only appearance, can always be subsumed under the
other constant A.
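
As a rough numerical check of this example, we can substitute y(t) = Ae^{-t^3} into the left side of dy/dt + 3t^2 y = 0 and confirm that the residual is negligible for more than one value of the arbitrary constant. The central-difference derivative, the step h, and the two values of A are assumptions of this sketch.

```python
import math

def residual(y, u, t, h=1e-5):
    """Left side dy/dt + u(t)*y of the homogeneous equation, with a central-difference derivative."""
    dydt = (y(t + h) - y(t - h)) / (2 * h)
    return dydt + u(t) * y(t)

u = lambda t: 3 * t**2                       # variable coefficient of the example
for A in (1.0, -2.5):                        # two assumed values of the arbitrary constant
    y = lambda t, A=A: A * math.exp(-t**3)   # general solution A e^{-t^3}
    print(A, [abs(residual(y, u, t)) < 1e-6 for t in (0.0, 0.5, 1.0)])
```

The residual vanishes (up to numerical error) whatever value A takes, which is what makes Ae^{-t^3} a general solution.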




The Nonhomogeneous Case 


For the nonhomogeneous case, where w(t) ≠ 0, the solution is not as easy to obtain. We
shall try to find that solution via the concept of exact differential equations, to be discussed
in Sec. 15.4. It does no harm, however, to state the result here first: Given the differential
equation (15.12), the general solution is

$$y(t) = e^{-\int u\,dt}\left(A + \int w\,e^{\int u\,dt}\,dt\right) \tag{15.15}$$

where A is an arbitrary constant that can be definitized if we have an appropriate initial
condition.

It is of interest that this general solution, like the solution in the constant-coefficient
constant-term case, again consists of two additive components. Furthermore, one of these
two, Ae^{-∫u dt}, is nothing but the general solution of the reduced (homogeneous) equation,
derived earlier in (15.14), and is therefore in the nature of a complementary function.


Example 2

Find the general solution of the equation dy/dt + 2ty = t. Here we have

$$u = 2t \qquad w = t \qquad \text{and} \qquad \int u\,dt = t^2 + k \quad (k \text{ arbitrary})$$

Thus, by (15.15), we have

$$\begin{aligned}
y(t) &= e^{-(t^2 + k)}\left(A + \int t\,e^{t^2 + k}\,dt\right) \\
&= e^{-t^2}e^{-k}\left(A + e^{k}\int t\,e^{t^2}\,dt\right) \\
&= Ae^{-k}e^{-t^2} + e^{-t^2}\left(\tfrac{1}{2}e^{t^2} + c\right) \qquad [e^{-k}e^{k} = 1] \\
&= (Ae^{-k} + c)e^{-t^2} + \tfrac{1}{2} \\
&= Be^{-t^2} + \tfrac{1}{2} \qquad \text{where } B \equiv Ae^{-k} + c \text{ is arbitrary}
\end{aligned}$$

The validity of this solution can again be checked by differentiation.

It is interesting to note that, in this example, we could again have omitted the constant
of integration k, as well as the constant of integration c, without affecting the final outcome.
This is because both k and c may be subsumed under the arbitrary constant B in the final
solution. You are urged to try out the simpler process of applying (15.15) without using the
constants k and c, and verify that the same solution will emerge.
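
Formula (15.15) can also be evaluated numerically when the integrals are awkward to do by hand. The sketch below rests on an assumption of our own: both indefinite integrals are replaced by definite integrals from 0 to t, in which case the arbitrary constant A equals y(0). The helper names `simpson` and `linear_ode_path`, the use of Simpson's rule, and the initial value y(0) = 1.5 are likewise illustrative choices, not part of the text.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def linear_ode_path(u, w, y0):
    """y(t) = e^{-U(t)} * (y0 + integral_0^t w(s) e^{U(s)} ds), where U(t) = integral_0^t u(s) ds."""
    U = lambda t: simpson(u, 0.0, t)
    def y(t):
        inner = simpson(lambda s: w(s) * math.exp(U(s)), 0.0, t)
        return math.exp(-U(t)) * (y0 + inner)
    return y

# dy/dt + 2ty = t; closed form y = B e^{-t^2} + 1/2 with B = y(0) - 1/2.
y = linear_ode_path(u=lambda t: 2 * t, w=lambda t: t, y0=1.5)   # y(0) = 1.5 is assumed, so B = 1
for t in (0.0, 0.5, 1.0, 2.0):
    print(t, y(t), math.exp(-t**2) + 0.5)   # numerical value next to the closed form
```

The two columns agree to within the accuracy of the quadrature, which offers some reassurance that the closed-form answer of the example is right.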


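Example 3

Solve the equation dy/dt + 4ty = 4t. This time we shall omit the constants of integration.
Since

$$u = 4t \qquad w = 4t \qquad \text{and} \qquad \int u\,dt = 2t^2 \quad \text{[constant omitted]}$$

the general solution is, by (15.15),

$$\begin{aligned}
y(t) &= e^{-2t^2}\left(A + \int 4t\,e^{2t^2}\,dt\right) = e^{-2t^2}\left(A + e^{2t^2}\right) \qquad \text{[constant omitted]} \\
&= Ae^{-2t^2} + 1
\end{aligned}$$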




As may be expected, the omission of the constants of integration serves to simplify the pro-
cedure substantially.

The differential equation dy/dt + uy = w in (15.12) is more general than the equation
dy/dt + ay = b in (15.4), since u and w are not necessarily constant, as are a and b. Accord-
ingly, solution formula (15.15) is also more general than solution formula (15.5). In fact,
when we set u = a and w = b, (15.15) should reduce to (15.5). This is indeed the case. For
when we have

$$u = a \qquad w = b \qquad \text{and} \qquad \int u\,dt = at \quad \text{[constant omitted]}$$

then (15.15) becomes

$$\begin{aligned}
y(t) &= e^{-at}\left(A + \int b\,e^{at}\,dt\right) = e^{-at}\left(A + \frac{b}{a}e^{at}\right) \qquad \text{[constant omitted]} \\
&= Ae^{-at} + \frac{b}{a}
\end{aligned}$$

which is identical with (15.5).





EXERCISE 15.3 


Solve the following first-order linear differential equations; if an initial condition is given,
definitize the arbitrary constant:

1. dy/dt + 5y = 15

2. dy/dt + 2ty = 0

3. dy/dt + 2ty = t; y(0) = 3/2

4. dy/dt + t^2 y = 5t^2; y(0) = 6

5. 2 dy/dt + 12y + 2e^t = 0; y(0) = 6/7

6. dy/dt + y = t


15.4 Exact Differential Equations 





We shall now introduce the concept of exact differential equations and use the solution 
method pertaining thereto to obtain the solution formula (15.15) previously cited for the dif- 
ferential equation (15.12). Even though our immediate purpose is to use it to solve a linear 
differential equation, an exact differential equation can be either linear or nonlinear by itself. 


Exact Differential Equations 
Given a function of two variables F(y, t), its total differential is

$$dF(y, t) = \frac{\partial F}{\partial y}\,dy + \frac{\partial F}{\partial t}\,dt$$

When this differential is set equal to zero, the resulting equation

$$\frac{\partial F}{\partial y}\,dy + \frac{\partial F}{\partial t}\,dt = 0$$

is known as an exact differential equation, because its left side is exactly the differential of
the function F(y, t). For instance, given

$$F(y, t) = y^2 t + k \qquad (k \text{ a constant})$$

the total differential is

$$dF = 2yt\,dy + y^2\,dt$$

thus the differential equation

$$2yt\,dy + y^2\,dt = 0 \qquad \text{or} \qquad \frac{dy}{dt} + \frac{y^2}{2yt} = 0 \tag{15.16}$$

is exact.
In general, a differential equation

$$M\,dy + N\,dt = 0 \tag{15.17}$$

is exact if and only if there exists a function F(y, t) such that M = ∂F/∂y and N =
∂F/∂t. By Young’s theorem, which states that ∂²F/∂t∂y = ∂²F/∂y∂t, however, we can
also state that (15.17) is exact if and only if

$$\frac{\partial M}{\partial t} = \frac{\partial N}{\partial y} \tag{15.18}$$

This last equation gives us a simple test for the exactness of a differential equation. Applied
to (15.16), where M = 2yt and N = y², this test yields ∂M/∂t = 2y = ∂N/∂y; thus the
exactness of the said differential equation is duly verified.
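
The exactness test (15.18) can be spot-checked numerically by comparing finite-difference estimates of ∂M/∂t and ∂N/∂y. This is only a sketch: the function name `is_exact`, the sample points, the step h, and the tolerance are all assumptions, and agreement at a handful of points suggests exactness without proving it.

```python
def is_exact(M, N, points, h=1e-5, tol=1e-6):
    """Spot-check (15.18): compare dM/dt with dN/dy at each sample point (y, t)."""
    for y, t in points:
        dM_dt = (M(y, t + h) - M(y, t - h)) / (2 * h)   # central difference in t
        dN_dy = (N(y + h, t) - N(y - h, t)) / (2 * h)   # central difference in y
        if abs(dM_dt - dN_dy) > tol:
            return False
    return True

# Arbitrary sample points (y, t).
pts = [(1.0, 1.0), (2.0, 0.5), (-1.5, 3.0)]

# Equation (15.16): M = 2yt, N = y^2.
print(is_exact(lambda y, t: 2 * y * t, lambda y, t: y**2, pts))   # True
```

For (15.16) both partial derivatives equal 2y at every point, so the check passes.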

Note that no restrictions have been placed on the terms M and N with regard to the man- 
ner in which the variable y occurs. Thus an exact differential equation may very well be 
nonlinear (in y). Nevertheless, it will always be of the first order and the first degree. 

Being exact, the differential equation merely says

$$dF(y, t) = 0$$

Thus its general solution should clearly be in the form

$$F(y, t) = c$$

To solve an exact differential equation is basically, therefore, to search for the (primitive)
function F(y, t) and then set it equal to an arbitrary constant. Let us outline a method of
finding this for the equation M dy + N dt = 0.


Method of Solution 


To begin with, since M = ∂F/∂y, the function F must contain the integral of M with re-
spect to the variable y; hence we can write out a preliminary result, in a yet indeterminate
form, as follows:

$$F(y, t) = \int M\,dy + \psi(t) \tag{15.19}$$

Here M, a partial derivative, is to be integrated with respect to y only; that is, t is to be
treated as a constant in the integration process, just as it was treated as a constant in the par-
tial differentiation of F(y, t) that resulted in M = ∂F/∂y.* Since, in differentiating F(y, t)
partially with respect to y, any additive term containing only the variable t and/or some con-
stants (but with no y) would drop out, we must now take care to reinstate such terms in the
integration process. This explains why we have introduced in (15.19) a general term ψ(t),
which, though not exactly the same as a constant of integration, has a precisely identical
role to play as the latter. It is relatively easy to get ∫M dy; but how do we pin down the
exact form of this ψ(t) term?

The trick is to utilize the fact that N = ∂F/∂t. But the procedure is best explained with
the help of specific examples.
Example 1

Solve the exact differential equation

$$2yt\,dy + y^2\,dt = 0 \qquad \text{[reproduced from (15.16)]}$$

In this equation, we have

$$M = 2yt \qquad \text{and} \qquad N = y^2$$

Step i By (15.19), we can first write the preliminary result

$$F(y, t) = \int 2yt\,dy + \psi(t) = y^2 t + \psi(t)$$

Note that we have omitted the constant of integration, because it can automatically be
merged into the expression ψ(t).

Step ii If we differentiate the result from Step i partially with respect to t, we can obtain

$$\frac{\partial F}{\partial t} = y^2 + \psi'(t)$$

But since N = ∂F/∂t, we can equate N = y² and ∂F/∂t = y² + ψ'(t), to get

$$\psi'(t) = 0$$

Step iii Integration of the last result gives us

$$\psi(t) = \int \psi'(t)\,dt = \int 0\,dt = k$$

and now we have a specific form of ψ(t). It happens in the present case that ψ(t) is simply
a constant; more generally, it can be a nonconstant function of t.

Step iv The results of Steps i and iii can be combined to yield

$$F(y, t) = y^2 t + k$$

The solution of the exact differential equation should then be F(y, t) = c. But since the con-
stant k can be merged into c, we may write the solution simply as

$$y^2 t = c \qquad \text{or} \qquad y(t) = ct^{-1/2}$$

where c is arbitrary.
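
A quick way to see that the result of this example is sensible is to confirm that the primitive function stays constant along the solution path. In the sketch below the constant in y(t) = ct^{-1/2} is set to the assumed value c = 3, so that y²t should equal 9 at every t > 0.

```python
F = lambda y, t: y**2 * t                # primitive function from Step iv (constant k dropped)
y_path = lambda t, c=3.0: c * t**-0.5    # solution path y(t) = c t^{-1/2}, with an assumed c = 3

# Along the solution path, F(y(t), t) should not change with t.
values = [F(y_path(t), t) for t in (0.5, 1.0, 2.0, 10.0)]
print(values)   # each entry is (approximately) 9.0
```

The constancy of F along the path is just another way of saying that dF = 0 there.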


* Some writers employ the operator symbol ∫(···) ∂y to emphasize that the integration is with respect
to y only. We shall still use the symbol ∫(···) dy here, since there is little possibility of confusion.




Salve the equation (t+ 2y) dy+(y+ 30) dt =0. First let us check whether this is an 
exact differential equation. Setting M=t+2y and N= y+ 3¢2, we find that aM/at= 
1 =4@N/dy, Thus the equation passes the exactness test. To find its solution, we again 
follow the procedure outlined in Example 1. 


Step i Apply (15.19) and write 
Fy,0= / (t+ 2y) dyt+ w= yt+y?+ WO — [constant merged into (s)] 
Step ii Differentiate this result with respect to f, to get 
OF ‘ 
—= E 
ap nyt) 
Then, equating this to N = y + 322, we find that 
w@ =3? 
Step ili Integrate this last result to get 
wt) = pe dt=t — [constant may be omitted] 
Ster iv Combine the results of Steps i and iii to get the complete form of the function 
Fy, 0: 
Fly, Q=ytty? +08 
which implies that the solution of the given differential equation is 
yity2+Pac 
You should verify that setting the total differential of this equation equal to zero will indeed 


produce the given differential equation. 
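One informal way to see what the solution $yt + y^2 + t^3 = c$ means is to integrate the equation numerically and watch F(y, t) stay constant along the resulting path. The sketch below does this with a classical Runge-Kutta step; the starting point y(0) = 1, the end time, and the function names are assumptions chosen for illustration.

```python
import math

def M(y, t):
    return t + 2 * y

def N(y, t):
    return y + 3 * t ** 2

def F(y, t):
    return y * t + y ** 2 + t ** 3

def slope(t, y):
    # M dy + N dt = 0 rearranged as dy/dt = -N/M (valid while M != 0).
    return -N(y, t) / M(y, t)

def rk4(f, t0, y0, t1, steps=200):
    """Classical fourth-order Runge-Kutta integration of dy/dt = f(t, y)."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h * k1 / 2)
        k3 = f(t + h / 2, y + h * k2 / 2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return y

# Hypothetical starting point y(0) = 1, which fixes c = F(1, 0) = 1.
c = F(1.0, 0.0)
y_end = rk4(slope, 0.0, 1.0, 0.5)
drift = abs(F(y_end, 0.5) - c)   # how far F has moved away from c
```

If the four-step result is right, `drift` should be negligible: the numerically traced path stays on the level curve F(y, t) = c.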


This four-step procedure can be used to solve any exact differential equation. Interestingly,
it may even be applicable when the given equation is not exact. To see this, however,
we must first introduce the concept of integrating factor.


Integrating Factor 
Sometimes an inexact differential equation can be made exact by multiplying every term of
the equation by a particular common factor. Such a factor is called an integrating factor.


Example 3

The differential equation

\[ 2t\,dy + y\,dt = 0 \]

is not exact, because it does not satisfy (15.18):

\[ \frac{\partial M}{\partial t} = \frac{\partial}{\partial t}(2t) = 2 \;\neq\; \frac{\partial N}{\partial y} = \frac{\partial}{\partial y}(y) = 1 \]

However, if we multiply each term by y, the given equation will turn into (15.16), which has
been established to be exact. Thus y is an integrating factor for the differential equation in
the present example.
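The effect of the integrating factor can be illustrated numerically. The sketch below estimates $\partial M/\partial t - \partial N/\partial y$ by central differences, first for the original equation and then after multiplication by y; the function name `exactness_gap` and the evaluation point are illustrative assumptions.

```python
def exactness_gap(M, N, y, t, h=1e-6):
    """Return dM/dt - dN/dy at (y, t), estimated by central differences."""
    dM_dt = (M(y, t + h) - M(y, t - h)) / (2 * h)
    dN_dy = (N(y + h, t) - N(y - h, t)) / (2 * h)
    return dM_dt - dN_dy

# Original equation: 2t dy + y dt = 0, so M = 2t and N = y.
gap_before = exactness_gap(lambda y, t: 2 * t, lambda y, t: y, 1.5, 2.0)

# After multiplying through by y: 2yt dy + y^2 dt = 0, so M = 2yt and N = y^2.
gap_after = exactness_gap(lambda y, t: 2 * y * t, lambda y, t: y ** 2, 1.5, 2.0)
```

Here `gap_before` comes out near 1 (the test fails, since 2 is not equal to 1), while `gap_after` comes out near 0 (the test is passed).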


490 Part Five Dynamic Analysis


When an integrating factor can be found for an inexact differential equation, it is always
possible to render it exact, and then the four-step solution procedure can be readily put
to use.

Solution of First-Order Linear Differential Equations 


The general first-order linear differential equation

\[ \frac{dy}{dt} + uy = w \]

which, in the format of (15.17), can be expressed as

\[ dy + (uy - w)\,dt = 0 \tag{15.20} \]

has the integrating factor

\[ e^{\int u\,dt} \equiv \exp\left(\int u\,dt\right) \]

This integrating factor, whose form is by no means intuitively obvious, can be "discovered"
as follows. Let I be the (yet unknown) integrating factor. Multiplication of (15.20)
through by I should convert it into an exact differential equation

\[ \underbrace{I}_{M}\,dy + \underbrace{I(uy - w)}_{N}\,dt = 0 \tag{15.20'} \]

The exactness test dictates that $\partial M/\partial t = \partial N/\partial y$. Visual inspection of the M and N
expressions suggests that, since M consists of I only, and since u and w are functions of t
alone, the exactness test will reduce to a very simple condition if I is also a function of
t alone. For then the test $\partial M/\partial t = \partial N/\partial y$ becomes

\[ \frac{dI}{dt} = Iu \qquad \text{or} \qquad \frac{dI/dt}{I} = u \]

Thus the special form I = I(t) can indeed work, provided it has a rate of growth equal to
u, or more explicitly, u(t). Accordingly, I(t) should take the specific form

\[ I(t) = A e^{\int u\,dt} \qquad \text{[cf. (15.13) and (15.14)]} \]

As can be easily verified, however, the constant A can be set equal to 1 without affecting the
ability of I(t) to meet the exactness test. Thus we can use the simpler form $e^{\int u\,dt}$ as the
integrating factor.

Substitution of this integrating factor into (15.20') yields the exact differential equation

\[ e^{\int u\,dt}\,dy + e^{\int u\,dt}(uy - w)\,dt = 0 \tag{15.20''} \]

which can then be solved by the four-step procedure.


Step i  First, we apply (15.19) to obtain

\[ F(y, t) = \int e^{\int u\,dt}\,dy + \psi(t) = y e^{\int u\,dt} + \psi(t) \]

The result of integration emerges in this simple form because the integrand is independent
of the variable y.

Step ii  Next, we differentiate the result from Step i with respect to t, to get

\[ \frac{\partial F}{\partial t} = y u e^{\int u\,dt} + \psi'(t) \qquad \text{[chain rule]} \]

And, since this can be equated to $N = e^{\int u\,dt}(uy - w)$, we have

\[ \psi'(t) = -w e^{\int u\,dt} \]

Step iii  Straight integration now yields

\[ \psi(t) = -\int w e^{\int u\,dt}\,dt \]

Inasmuch as the functions u = u(t) and w = w(t) have not been given specific forms, nothing
further can be done about this integral, and we must be contented with this rather
general expression for $\psi(t)$.

Step iv  Substituting this $\psi(t)$ expression into the result of Step i, we find that

\[ F(y, t) = y e^{\int u\,dt} - \int w e^{\int u\,dt}\,dt \]

So the general solution of the exact differential equation (15.20''), and of the equivalent,
though inexact, first-order linear differential equation (15.20), is

\[ y e^{\int u\,dt} - \int w e^{\int u\,dt}\,dt = c \]

Upon rearrangement and substitution of the (arbitrary constant) symbol c by A, this can be
written as

\[ y(t) = e^{-\int u\,dt}\left(A + \int w e^{\int u\,dt}\,dt\right) \tag{15.21} \]

which is exactly the result given earlier in (15.15).
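Formula (15.21) can also be evaluated numerically when the integrals have no convenient closed form. The sketch below is one possible way to do so, under two assumptions that are ours rather than the text's: both antiderivatives are anchored at t = 0 (so that the constant A coincides with y(0)), and the integrals are approximated by Simpson's rule. The coefficient functions u = 2t and w = t are chosen only for illustration; for them (15.21) reduces to $y(t) = (y(0) - \tfrac{1}{2})e^{-t^2} + \tfrac{1}{2}$, which serves as a check.

```python
import math

def integrate(f, a, b, n=400):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def solve_linear(u, w, y0, t):
    """Evaluate (15.21) with antiderivatives anchored at 0, so that A = y(0)."""
    U = lambda s: integrate(u, 0.0, s)            # U(s) = integral of u from 0 to s
    inner = integrate(lambda s: w(s) * math.exp(U(s)), 0.0, t)
    return math.exp(-U(t)) * (y0 + inner)

# Illustrative choice (an assumption): u(t) = 2t, w(t) = t, y(0) = 2.
y_numeric = solve_linear(lambda s: 2 * s, lambda s: s, y0=2.0, t=1.0)

# Closed form implied by (15.21) for this choice: y = (y0 - 1/2) e^{-t^2} + 1/2.
y_closed = (2.0 - 0.5) * math.exp(-1.0) + 0.5
```

The nested quadrature is deliberately simple rather than efficient; it is meant only to mirror the structure of (15.21) term by term.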





EXERCISE 15.4 
1. Verify that each of the following differential equations is exact, and solve by the
four-step procedure:
(a) $2yt^3\,dy + 3y^2t^2\,dt = 0$
(b) $3y^2t\,dy + (y^3 + 2t)\,dt = 0$
(c) $t(1 + 2y)\,dy + y(1 + y)\,dt = 0$
(d) $\dfrac{dy}{dt} + \dfrac{2y^4t + 3t^2}{4y^3t^2} = 0$   [Hint: First convert to the form of (15.17).]

2. Are the following differential equations exact? If not, try t, y, and $y^2$ as possible
integrating factors.
(a) $2(t^3 + 1)\,dy + 3yt^2\,dt = 0$
(b) $4y^3t\,dy + (2y^4 + 3t)\,dt = 0$

3. By applying the four-step procedure to the general exact differential equation
M dy + N dt = 0, derive the following formula for the general solution of an exact
differential equation:

\[ \int M\,dy + \int N\,dt - \int \left(\frac{\partial}{\partial t}\int M\,dy\right)dt = c \]


15.5 Nonlinear Differential Equations 
of the First Order and First Degree 





In a linear differential equation, we restrict to the first degree not only the derivative dy/dt,
but also the dependent variable y, and we do not allow the product y(dy/dt) to appear.
When y appears in a power higher than one, the equation becomes nonlinear even if it only
contains the derivative dy/dt in the first degree. In general, an equation in the form

\[ f(y, t)\,dy + g(y, t)\,dt = 0 \tag{15.22} \]

or

\[ \frac{dy}{dt} = h(y, t) \tag{15.22'} \]

where there is no restriction on the powers of y and t, constitutes a first-order first-degree
nonlinear differential equation because dy/dt is a first-order derivative in the first power.
Certain varieties of such equations can be solved with relative ease by more or less routine
procedures. We shall briefly discuss three cases.






Exact Differential Equations 
The first is the now-familiar case of exact differential equations. As was pointed out earlier,
the y variable can appear in an exact equation in a high power, as in (15.16),
$2yt\,dy + y^2\,dt = 0$, which you should compare with (15.22). True, the cancellation of the common
factor y from both terms on the left will reduce the equation to a linear form, but the exactness
property will be lost in that event. As an exact differential equation, therefore, it must
be regarded as nonlinear.

Since the solution method for exact differential equations has already been discussed, 
no further comment is necessary here. 


Separable Variables 
The differential equation in (15.22)

\[ f(y, t)\,dy + g(y, t)\,dt = 0 \]

may happen to possess the convenient property that the function f is in the variable y alone,
while the function g involves only the variable t, so that the equation reduces to the special
form

\[ f(y)\,dy + g(t)\,dt = 0 \tag{15.23} \]

In such an event, the variables are said to be separable, because the terms involving y
(consolidated into f(y)) can be mathematically separated from the terms involving t,
which are collected under g(t). To solve this special type of equation, only simple integration
techniques are required.

Example 1

Solve the equation $3y^2\,dy - t\,dt = 0$. First let us rewrite the equation as

\[ 3y^2\,dy = t\,dt \]

Integrating the two sides (each of which is a differential) and equating the results, we get

\[ \int 3y^2\,dy = \int t\,dt \qquad \text{or} \qquad y^3 + c_1 = \tfrac{1}{2}t^2 + c_2 \]

Thus the general solution can be written as

\[ y^3 = \tfrac{1}{2}t^2 + c \qquad \text{or} \qquad y(t) = \left(\tfrac{1}{2}t^2 + c\right)^{1/3} \]

The notable point here is that the integration of each term is performed with respect to
a different variable; it is this which makes the separable-variable equation comparatively
easy to handle.
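The solution of this first separable-variable example can be checked by differentiating it with respect to t. A short sketch of that check, using only the symbols of the example:

```latex
\[
\frac{d}{dt}\left(y^3\right) = \frac{d}{dt}\left(\tfrac{1}{2}t^2 + c\right)
\quad\Longrightarrow\quad
3y^2\,\frac{dy}{dt} = t
\quad\Longrightarrow\quad
3y^2\,dy - t\,dt = 0
\]
```

The arbitrary constant c drops out upon differentiation, which is why it is free to take any value in the general solution.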


Example 2

Solve the equation $2t\,dy + y\,dt = 0$. At first glance, this differential equation does not seem
to belong in this spot, because it fails to conform to the general form of (15.23). To be
specific, the coefficients of dy and dt are seen to involve the "wrong" variables. However, a
simple transformation, dividing through by $2yt$ ($\neq 0$), will reduce the equation to the
separable-variable form

\[ \frac{1}{y}\,dy + \frac{1}{2t}\,dt = 0 \]

From our experience with Example 1, we can work toward the solution (without first
transposing a term) as follows:†

\[ \int \frac{1}{y}\,dy + \int \frac{1}{2t}\,dt = c \]

\[ \text{so} \qquad \ln y + \tfrac{1}{2}\ln t = c \qquad \text{or} \qquad \ln\left(y t^{1/2}\right) = c \]

Thus the solution is

\[ y t^{1/2} = e^c = k \qquad \text{or} \qquad y(t) = k\,t^{-1/2} \]

where k is an arbitrary constant, as are the symbols c and A employed elsewhere.


Note that, instead of solving the equation in Example 2 as we did, we could also have
transformed it first into an exact differential equation (by the integrating factor y) and then
solved it as such. The solution, already given in Example 1 of Sec. 15.4, must of course be
identical with the one just obtained by separation of variables. The point is that a given
differential equation can often be solvable in more than one way, and therefore one may have a
choice of the method to be used. In other cases, a differential equation that is not amenable
to a particular method may nonetheless become so after an appropriate transformation.
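That the two routes lead to the same family of solutions can be seen in one line. Starting from the separation-of-variables result $y(t) = k\,t^{-1/2}$ and squaring:

```latex
\[
y(t) = k\,t^{-1/2}
\quad\Longrightarrow\quad
y^2 = k^2\,t^{-1}
\quad\Longrightarrow\quad
y^2 t = k^2
\]
```

This has the form $y^2 t = c$ found by the exact-equation method in Sec. 15.4, with the arbitrary constant relabeled as $c = k^2$.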


Equations Reducible to the Linear Form 
If the differential equation dy/dt = h(y, t) happens to take the specific nonlinear form

\[ \frac{dy}{dt} + Ry = Ty^m \tag{15.24} \]

where R and T are two functions of t and m is any number other than 0 and 1 (what if
m = 0 or m = 1?), then the equation, referred to as a Bernoulli equation, can always be
reduced to a linear differential equation and be solved as such.


† In the integration result, we should, strictly speaking, have written $\ln|y|$ and $\tfrac{1}{2}\ln|t|$. If y and t can
be assumed to be positive, as is appropriate in the majority of economic contexts, then the result
given in the text will occur.


The reduction procedure is relatively simple. First, we can divide (15.24) by $y^m$, to get

\[ y^{-m}\frac{dy}{dt} + Ry^{1-m} = T \]

If we adopt a shorthand variable z as follows:

\[ z = y^{1-m} \qquad \left[\text{so that } \frac{dz}{dt} = \frac{dz}{dy}\frac{dy}{dt} = (1 - m)\,y^{-m}\frac{dy}{dt}\right] \]

then the preceding equation can be written as

\[ \frac{1}{1 - m}\frac{dz}{dt} + Rz = T \]

Moreover, after multiplying through by (1 - m) dt and rearranging, we can transform the
equation into

\[ dz + [(1 - m)Rz - (1 - m)T]\,dt = 0 \tag{15.24'} \]

This is seen to be a first-order linear differential equation of the form (15.20), in which the
variable z has taken the place of y.

Clearly, we can apply formula (15.21) to find its solution z(t). Then, as a final step, we
can translate z back to y by reverse substitution.


Example 3

Solve the equation $dy/dt + ty = 3ty^2$. This is a Bernoulli equation, with m = 2 (giving us
$z = y^{1-m} = y^{-1}$), R = t, and T = 3t. Thus, by (15.24'), we can write the linearized
differential equation as

\[ dz + (-tz + 3t)\,dt = 0 \]

By applying formula (15.21), the solution can be found to be

\[ z(t) = A\exp\left(\tfrac{1}{2}t^2\right) + 3 \]

(As an exercise, trace out the steps leading to this solution.)

Since our primary interest lies in the solution y(t) rather than z(t), we must perform a
reverse transformation using the equation $z = y^{-1}$, or $y = z^{-1}$. By taking the reciprocal of
z(t), therefore, we get

\[ y(t) = \frac{1}{A\exp\left(\tfrac{1}{2}t^2\right) + 3} \]

as the desired solution. This is a general solution, because an arbitrary constant A is present.
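A numerical spot check of this solution is easy to set up. The sketch below plugs the closed form into the original Bernoulli equation, estimating dy/dt by a central difference; the particular values of A and t, and the function names, are arbitrary illustrative choices.

```python
import math

def y_closed(t, A):
    """General solution of Example 3 for a given arbitrary constant A."""
    return 1.0 / (A * math.exp(0.5 * t ** 2) + 3.0)

def residual(t, A, h=1e-5):
    """dy/dt + t*y - 3*t*y**2, with dy/dt estimated by a central difference."""
    dy_dt = (y_closed(t + h, A) - y_closed(t - h, A)) / (2 * h)
    y = y_closed(t, A)
    return dy_dt + t * y - 3 * t * y ** 2

# Positive values of A keep the denominator away from zero on this grid.
worst = max(abs(residual(t, A))
            for A in (0.5, 1.0, 4.0)
            for t in (0.0, 0.7, 1.5))
```

A value of `worst` near zero indicates that the left and right sides of the equation agree at the sampled points, up to the finite-difference error.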


Example 4

Solve the equation $dy/dt + (1/t)y = y^3$. Here, we have m = 3 (thus $z = y^{-2}$), R = 1/t, and
T = 1; thus the equation can be linearized into the form

\[ dz + \left(\frac{-2}{t}z + 2\right)dt = 0 \]

As you can verify, by the use of formula (15.21), the solution of this differential equation is

\[ z(t) = At^2 + 2t \]

It then follows, by the reverse transformation $y = z^{-1/2}$, that the general solution in the
original variable is to be written as

\[ y(t) = (At^2 + 2t)^{-1/2} \]

As an exercise, check the validity of the solutions of these last two examples by
differentiation.
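For Example 4, one way the check by differentiation might go is sketched here. It uses the fact that $y^3 = (At^2 + 2t)^{-3/2}$, so that $y = (At^2 + 2t)\,y^3$:

```latex
\begin{aligned}
\frac{dy}{dt} &= -\tfrac{1}{2}\,(At^2 + 2t)^{-3/2}\,(2At + 2) = -(At + 1)\,y^3 \\
\frac{1}{t}\,y &= \frac{At^2 + 2t}{t}\,(At^2 + 2t)^{-3/2} = (At + 2)\,y^3 \\
\frac{dy}{dt} + \frac{1}{t}\,y &= \bigl[-(At + 1) + (At + 2)\bigr]\,y^3 = y^3
\end{aligned}
```

The last line reproduces the given equation, so the solution passes the check for any value of A for which $At^2 + 2t$ is positive.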





EXERCISE 15.5 


1. Determine, for each of the following, (1) whether the variables are separable and (2)
whether the equation is linear or else can be linearized:

(a) $2t\,dy + 2y\,dt = 0$          (c) $\dfrac{dy}{dt} = -\dfrac{t}{y}$

(b) $\dfrac{y}{y + t}\,dy + \dfrac{2t}{y + t}\,dt = 0$          (d) $\dfrac{dy}{dt} = 3y^2t$

2. Solve (a) and (b) in Prob. 1 by separation of variables, taking y and t to be positive.
Check your answers by differentiation.

3. Solve (c) in Prob. 1 as a separable-variable equation and, also, as a Bernoulli equation.

4. Solve (d) in Prob. 1 as a separable-variable equation and, also, as a Bernoulli equation.

5. Verify the correctness of the intermediate solution $z(t) = At^2 + 2t$ in Example 4 by
showing that its derivative dz/dt is consistent with the linearized differential equation.


15.6 The Qualitative-Graphic Approach 





The several cases of nonlinear differential equations previously discussed (exact differential
equations, separable-variable equations, and Bernoulli equations) have all been solved
quantitatively. That is, we have in every case sought and found a time path y(t) which, for
each value of t, tells the specific corresponding value of the variable y.

At times, we may not be able to find a quantitative solution from a given differential
equation. Yet, in such cases, it may nonetheless be possible to ascertain the qualitative
properties of the time path (primarily, whether y(t) converges) by directly observing the
differential equation itself or by analyzing its graph. Even when quantitative solutions are
available, moreover, we may still employ the techniques of qualitative analysis if the
qualitative aspect of the time path is our principal or exclusive concern.








The Phase Diagram 


Given a first-order differential equation in the general form

\[ \frac{dy}{dt} = f(y) \]

either linear or nonlinear in the variable y, we can plot dy/dt against y as in Fig. 15.3. Such
a geometric representation, feasible whenever dy/dt is a function of y alone, is called a
phase diagram, and the graph representing the function f, a phase line. (A differential
equation of this form, in which the time variable t does not appear as a separate argument of
the function f, is said to be an autonomous differential equation.)

[FIGURE 15.3: phase lines A, B, and C, with dy/dt on the vertical axis and y on the horizontal axis]

Once a phase line is known, its configuration will impart significant qualitative information
regarding the time path y(t). The clue to this lies in the following two general remarks:


1. Anywhere above the horizontal axis (where dy/dt > 0), y must be increasing over time
and, as far as the y axis is concerned, must be moving from left to right. By analogous
reasoning, any point below the horizontal axis must be associated with a leftward
movement in the variable y, because the negativity of dy/dt means that y decreases over time.
These directional tendencies explain why the arrowheads on the illustrative phase lines
in Fig. 15.3 are drawn as they are. Above the horizontal axis, the arrows are uniformly
pointed toward the right: toward the northeast or southeast or due east, as the case may
be. The opposite is true below the y axis. Moreover, these results are independent of the
algebraic sign of y; even if phase line A (or any other) is transplanted to the left of the
vertical axis, the direction of the arrows will not be affected.

2. An equilibrium level of y (in the intertemporal sense of the term), if it exists, can occur
only on the horizontal axis, where dy/dt = 0 (y stationary over time). To find an
equilibrium, therefore, it is necessary only to consider the intersection of the phase line with
the y axis.† To test the dynamic stability of equilibrium, on the other hand, we should
also check whether, regardless of the initial position of y, the phase line will always
guide it toward the equilibrium position at the said intersection.


Types of Time Path 
On the basis of the preceding general remarks, we may observe three different types of time
path from the illustrative phase lines in Fig. 15.3.

Phase line A has an equilibrium at point $y_a$; but above as well as below that point, the
arrowheads consistently lead away from equilibrium. Thus, although equilibrium can be
attained if it happens that $y(0) = y_a$, the more usual case of $y(0) \neq y_a$ will result in y being
ever-increasing [if $y(0) > y_a$] or ever-decreasing [if $y(0) < y_a$]. Besides, in this case the
deviation of y from $y_a$ tends to grow at an increasing pace because, as we follow the
arrowheads on the phase line, we deviate farther from the y axis, thereby encountering
ever-increasing numerical values of dy/dt as well. The time path y(t) implied by phase line A
can therefore be represented by the curves shown in Fig. 15.4a, where y is plotted against t
(rather than dy/dt against y). The equilibrium $y_a$ is dynamically unstable.


† However, not all intersections represent equilibrium positions. We shall see this when we discuss
phase line C in Fig. 15.3.


[FIGURE 15.4: time paths y(t) plotted against t, in panels (a), (b), and (c)]


In contrast, phase line B implies a stable equilibrium at $y_b$. If $y(0) = y_b$, equilibrium
prevails at once. But the important feature of phase line B is that, even if $y(0) \neq y_b$, the
movement along the phase line will guide y toward the level of $y_b$. The time path y(t)
corresponding to this type of phase line should therefore be of the form shown in Fig. 15.4b,
which is reminiscent of the dynamic market model.

The preceding discussion suggests that, in general, it is the slope of the phase line at its
intersection point which holds the key to the dynamic stability of equilibrium or the
convergence of the time path. A (finite) positive slope, such as at point $y_a$, makes for dynamic
instability; whereas a (finite) negative slope, such as at $y_b$, implies dynamic stability.
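The slope criterion lends itself to a very small computational sketch. Below, the slope of a phase line at a known equilibrium is estimated by a central difference and its sign is used to label the equilibrium. The function `classify` and the phase line $f(y) = y(1 - y)$ are hypothetical illustrations and are not taken from the text.

```python
def classify(f, y_eq, h=1e-6):
    """Label an equilibrium of dy/dt = f(y) by the sign of the phase-line slope."""
    slope = (f(y_eq + h) - f(y_eq - h)) / (2 * h)
    if slope < 0:
        return "stable"
    if slope > 0:
        return "unstable"
    return "inconclusive"   # a zero slope is not covered by the criterion

def f(y):
    # Hypothetical phase line with intersections at y = 0 and y = 1.
    return y * (1.0 - y)

label_at_0 = classify(f, 0.0)   # slope +1 at y = 0
label_at_1 = classify(f, 1.0)   # slope -1 at y = 1
```

The sketch presumes that `y_eq` really is an intersection of the phase line with the y axis and that the slope there is finite; it does not cover the infinite-slope case of a looped phase line.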

This generalization can help us to draw qualitative inferences about given differential
equations without even plotting their phase lines. Take the linear differential equation in
(15.4), for instance:

\[ \frac{dy}{dt} + ay = b \qquad \text{or} \qquad \frac{dy}{dt} = -ay + b \]

Since the phase line will obviously have the (constant) slope -a, here assumed nonzero,
we may immediately infer (without drawing the line) that

\[ a \gtrless 0 \quad \Leftrightarrow \quad y(t) \begin{Bmatrix} \text{converges to} \\ \text{diverges from} \end{Bmatrix} \text{equilibrium} \]

As we may expect, this result coincides perfectly with what the quantitative solution of this
equation tells us:

\[ y(t) = \left[y(0) - \frac{b}{a}\right]e^{-at} + \frac{b}{a} \qquad \text{[from (15.5')]} \]

We have learned that, starting from a nonequilibrium position, the convergence of y(t)
hinges on the prospect that $e^{-at} \to 0$ as $t \to \infty$. This can happen if and only if a > 0; if
a < 0, then $e^{-at} \to \infty$ as $t \to \infty$, and y(t) cannot converge. Thus, our conclusion is one
and the same, whether it is arrived at quantitatively or qualitatively.
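The agreement between the two approaches can be illustrated with numbers. The sketch below evaluates the quantitative solution for one positive and one negative value of a; the parameter values are hypothetical and are picked so that the equilibrium b/a equals 3 in both cases.

```python
import math

def time_path(y0, a, b, t):
    """Quantitative solution y(t) = [y(0) - b/a] e^{-at} + b/a, for a != 0."""
    return (y0 - b / a) * math.exp(-a * t) + b / a

# a > 0: phase-line slope -a is negative, so y(t) should approach b/a = 3.
gap_stable = abs(time_path(1.0, 2.0, 6.0, 10.0) - 3.0)

# a < 0: phase-line slope -a is positive, so y(t) should move away from b/a = 3.
gap_unstable = abs(time_path(1.0, -2.0, -6.0, 10.0) - 3.0)
```

With these numbers the distance from equilibrium at t = 10 is $2e^{-20}$ in the first case and $2e^{20}$ in the second, in line with the sign rule for a.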

It remains to discuss phase line C, which, being a closed loop sitting across the horizontal
axis, does not qualify as a function but shows instead a relation between dy/dt and
y.‡ The interesting new element that emerges in this case is the possibility of a periodically
fluctuating time path. The way that phase line C is drawn, we shall find y fluctuating
between the two values $y_c$ and $y_c'$ in a perpetual motion. In order to generate the periodic
fluctuation, the loop must, of course, straddle the horizontal axis in such a manner that
dy/dt can alternately be positive and negative. Besides, at the two intersection points $y_c$
and $y_c'$, the phase line should have an infinite slope; otherwise the intersection will resemble
either $y_a$ or $y_b$, neither of which permits a continual flow of arrowheads. The type of
time path y(t) corresponding to this looped phase line is illustrated in Fig. 15.4c. Note that,
whenever y(t) hits the upper bound $y_c'$ or the lower bound $y_c$, we have dy/dt = 0 (local
extrema); but these values certainly do not represent equilibrium values of y. In terms
of Fig. 15.3, this means that not all intersections between a phase line and the y axis are
equilibrium positions.


‡ This can arise from a second-degree differential equation $(dy/dt)^2 = f(y)$.


In sum, for the study of the dynamic stability of equilibrium (or the convergence of the
time path), one has the alternative either of finding the time path itself or else of simply
drawing the inference from its phase line. We shall illustrate the application of the latter
approach with the Solow growth model. Henceforth, we shall denote the intertemporal
equilibrium value of y by $\bar{y}$, as distinct from y*.





EXERCISE 15.6 
1. Plot the phase line for each of the following, and discuss its qualitative implications:

(a) $\dfrac{dy}{dt} = y - 7$          (c) $\dfrac{dy}{dt} = 4 - \dfrac{y}{2}$

(b) $\dfrac{dy}{dt} = 1 - 5y$          (d) $\dfrac{dy}{dt} = 9y - 11$

2. Plot the phase line for each of the following and interpret:

(a) $\dfrac{dy}{dt} = (y + 1)^2 - 16$   $(y \geq 0)$

(b) $\dfrac{dy}{dt} = \tfrac{1}{2}y - y^2$   $(y \geq 0)$

3. Given $dy/dt = (y - 3)(y - 5) = y^2 - 8y + 15$:
(a) Deduce that there are two possible equilibrium levels of y, one at y = 3 and the
other at y = 5.
(b) Find the sign of $\dfrac{d}{dy}\left(\dfrac{dy}{dt}\right)$ at y = 3 and y = 5, respectively. What can you infer from
these?


15.7 Solow Growth Model 


The growth model of Professor Robert Solow,* a Nobel laureate, is purported to show,
among other things, that the razor's-edge growth path of the Domar model is primarily a
result of the particular production-function assumption adopted therein and that, under
alternative circumstances, the need for delicate balancing may not arise.


* Robert M. Solow, "A Contribution to the Theory of Economic Growth," Quarterly Journal of
Economics, February 1956, pp. 65-94.


The Framework
In the Domar model, output is explicitly stated as a function of capital alone: $\kappa = \rho K$ (the
productive capacity, or potential output, is a constant multiple of the stock of capital). The
absence of a labor input in the production function carries the implication that labor is
always combined with capital in a fixed proportion, so that it is feasible to consider
explicitly only one of these factors of production. Solow, in contrast, seeks to analyze the case
where capital and labor can be combined in varying proportions. Thus his production
function appears in the form

\[ Q = f(K, L) \qquad (K, L > 0) \]


where Q is output (net of depreciation), K is capital, and L is labor, all being used in the
macro sense. It is assumed that $f_K$ and $f_L$ are positive (positive marginal products), and
$f_{KK}$ and $f_{LL}$ are negative (diminishing returns to each input). Furthermore, the production
function f is taken to be linearly homogeneous (constant returns to scale). Consequently, it
is possible to write

\[ Q = Lf\left(\frac{K}{L}, 1\right) = L\phi(k) \qquad \text{where } k \equiv \frac{K}{L} \tag{15.25} \]

In view of the assumed signs of $f_K$ and $f_{KK}$, the newly introduced $\phi$ function (which, be
it noted, has only a single argument, k) must be characterized by a positive first derivative
and a negative second derivative. To verify this claim, we first recall from (12.49) that

\[ f_K \equiv MPP_K = \phi'(k) \]

hence $f_K > 0$ automatically means $\phi'(k) > 0$. Then, since

\[ f_{KK} = \frac{\partial \phi'(k)}{\partial K} = \frac{d\phi'(k)}{dk}\frac{\partial k}{\partial K} = \phi''(k)\frac{1}{L} \qquad \text{[see (12.48)]} \]

the assumption $f_{KK} < 0$ leads directly to the result $\phi''(k) < 0$. Thus the $\phi$ function
(which, according to (12.46), gives the $APP_L$ for every capital-labor ratio) is one that
increases with k at a decreasing rate.

Given that Q depends on K and L, it is necessary now to stipulate how the latter two
variables themselves are determined. Solow's assumptions are:

\[ \dot{K} \left(\equiv \frac{dK}{dt}\right) = sQ \qquad \text{[constant proportion of Q is invested]} \tag{15.26} \]

\[ \frac{\dot{L}}{L} \left(\equiv \frac{dL/dt}{L}\right) = \lambda \qquad (\lambda > 0) \qquad \text{[labor force grows exponentially]} \tag{15.27} \]

The symbol s represents a (constant) marginal propensity to save, and $\lambda$, a (constant) rate
of growth of labor. Note the dynamic nature of these assumptions; they specify not how the
levels of K and L are determined, but how their rates of change are.

Equations (15.25) through (15.27) constitute a complete model. To solve this model, we
shall first condense it into a single equation in one variable. To begin with, substitute
(15.25) into (15.26) to get

\[ \dot{K} = sL\phi(k) \tag{15.28} \]

Since $k \equiv K/L$, and $K \equiv kL$, however, we can obtain another expression for $\dot{K}$ by
differentiating the latter identity:

\[ \begin{aligned} \dot{K} &= L\dot{k} + k\dot{L} && \text{[product rule]} \\ &= L\dot{k} + k\lambda L && \text{[by (15.27)]} \end{aligned} \tag{15.29} \]

When (15.29) is equated to (15.28) and the common factor L eliminated, the result emerges
that

\[ \dot{k} = s\phi(k) - \lambda k \tag{15.30} \]

This equation, a differential equation in the variable k with two parameters s and $\lambda$, is
the fundamental equation of the Solow growth model.
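The elimination step that leads to the fundamental equation can be written out in full. The lines below only restate the algebra described in the text, with $\dot{k}$ denoting dk/dt:

```latex
\begin{aligned}
L\dot{k} + k\lambda L &= sL\phi(k) && \text{[equating (15.29) and (15.28)]} \\
\dot{k} + \lambda k &= s\phi(k) && \text{[dividing through by } L > 0\text{]} \\
\dot{k} &= s\phi(k) - \lambda k && \text{[which is (15.30)]}
\end{aligned}
```

Division by L is legitimate because the production function is defined only for positive K and L.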


A Qualitative-Graphic Analysis 

Because (15.30) is stated in a general-function form, no specific quantitative solution is
available. Nevertheless, we can analyze it qualitatively. To this end, we should plot a phase
line, with $\dot{k}$ on the vertical axis and k on the horizontal.

Since (15.30) contains two terms on the right, however, let us first plot these as two separate
curves. The $\lambda k$ term, a linear function of k, will obviously show up in Fig. 15.5a as a
straight line, with a zero vertical intercept and a slope equal to $\lambda$. The $s\phi(k)$ term, on the other
hand, plots as a curve that increases at a decreasing rate, like $\phi(k)$, since $s\phi(k)$ is merely a
constant fraction of the $\phi(k)$ curve. If we consider K to be an indispensable factor of production,
we must start the $s\phi(k)$ curve from the point of origin; this is because if K = 0 and thus
k = 0, Q must also be zero, as will be $\phi(k)$ and $s\phi(k)$. The way the curve is actually drawn
also reflects the implicit assumption that there exists a set of k values for which $s\phi(k)$
exceeds $\lambda k$, so that the two curves intersect at some positive value of k, namely $\bar{k}$.

Based upon these two curves, the value of $\dot{k}$ for each value of k can be measured by the
vertical distance between the two curves. Plotting the values of $\dot{k}$ against k, as in Fig. 15.5b,
will then yield the phase line we need. Note that, since the two curves in Fig. 15.5a intersect
when the capital-labor ratio is $\bar{k}$, the phase line in Fig. 15.5b must cross the horizontal
axis at $\bar{k}$. This marks $\bar{k}$ as the intertemporal equilibrium capital-labor ratio.

[FIGURE 15.5: (a) the curves $s\phi(k)$ and $\lambda k$ plotted against k; (b) the phase line, $\dot{k} = s\phi(k) - \lambda k$ plotted against k]

Inasmuch as the phase line has a negative slope at $\bar{k}$, the equilibrium is readily identified
as a stable one; given any (positive) initial value of k, the dynamic movement of the model
must lead us convergently to the equilibrium level $\bar{k}$. The significant point is that once this
equilibrium is attained, and thus the capital-labor ratio is (by definition) unvarying over
time, capital must thereafter grow apace with labor, at the identical rate $\lambda$. This will imply,
in turn, that net investment must grow at the rate $\lambda$ (see Exercise 15.7-2). Note, however,
that the word must is used here not in the sense of requirement, but with the implication of
automaticity. Thus, what the Solow model serves to show is that, given a rate of growth of
labor $\lambda$, the economy by itself, and without the delicate balancing à la Domar, can eventually
reach a state of steady growth in which investment will grow at the rate $\lambda$, the same as
K and L. Moreover, in order to satisfy (15.25), Q must grow at the same rate as well because
$\phi(k)$ is a constant when the capital-labor ratio remains unvarying at the level $\bar{k}$. Such a
situation, in which the relevant variables all grow at an identical rate, is called a steady
state, a generalization of the concept of stationary state (in which the relevant variables
all remain constant, or in other words all grow at the zero rate).
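Although the analysis above is purely qualitative, the equilibrium $\bar{k}$ can be located numerically once a specific $\phi$ is supplied. The sketch below is only an illustration: the function $\phi(k) = \ln(1 + k)$, the parameter values, and the search bracket are all assumptions of ours, chosen so that $\phi$ passes through the origin and rises at a decreasing rate, as the text requires.

```python
import math

s, lam = 0.3, 0.1                     # hypothetical saving rate and labor growth rate

def phi(k):
    # Hypothetical concave per-worker production function with phi(0) = 0.
    return math.log(1.0 + k)

def k_dot(k):
    """Right-hand side of (15.30): s*phi(k) - lam*k."""
    return s * phi(k) - lam * k

def bisect(f, lo, hi, tol=1e-12):
    """Find a root of f in [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

k_bar = bisect(k_dot, 1.0, 20.0)            # bracket chosen by inspection
slope_at_k_bar = s / (1.0 + k_bar) - lam    # derivative of k_dot at k_bar
```

For these assumed values the phase line crosses the horizontal axis between k = 5 and k = 6, and its slope there is negative, which is the stability condition discussed above.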

Note that, in the preceding analysis, the production function is assumed for convenience
to be invariant over time. If the state of technology is allowed to improve, on the other hand,
the production function will have to be duly modified. For instance, it may be written
instead in the form

\[ Q = T(t)\,f(K, L) \qquad \left(\frac{dT}{dt} > 0\right) \]

where T, some measure of technology, is an increasing function of time. Because of the
increasing multiplicative term T(t), a fixed amount of K and L will turn out a larger output at
a future date than at present. In this event, the $s\phi(k)$ curve in Fig. 15.5a will be subject to a
secular upward shift, resulting in successively higher intersections with the $\lambda k$ ray and
also in larger values of $\bar{k}$. With technological improvement, therefore, it will become
possible, in a succession of steady states, to have a larger and larger amount of capital
equipment available to each representative worker in the economy, with a concomitant rise
in productivity.


A Quantitative Illustration 
The preceding analysis had to be qualitative, owing to the presence of a general function
$\phi(k)$ in the model. But if we specify the production function to be a linearly homogeneous
Cobb-Douglas function, for instance, then a quantitative solution can be found as well.

Let us write the production function as

\[ Q = K^{\alpha}L^{1-\alpha} = L\left(\frac{K}{L}\right)^{\alpha} = Lk^{\alpha} \]

so that $\phi(k) = k^{\alpha}$. Then (15.30) becomes

\[ \dot{k} = sk^{\alpha} - \lambda k \qquad \text{or} \qquad \dot{k} + \lambda k = sk^{\alpha} \]

which is a Bernoulli equation in the variable k [see (15.24)], with $R = \lambda$, $T = s$, and
$m = \alpha$. Letting $z = k^{1-\alpha}$, we obtain its linearized version

\[ dz + [(1 - \alpha)\lambda z - (1 - \alpha)s]\,dt = 0 \]

\[ \text{or} \qquad \frac{dz}{dt} + \underbrace{(1 - \alpha)\lambda}_{a}\,z = \underbrace{(1 - \alpha)s}_{b} \]

This is a linear differential equation with a constant coefficient a and a constant term b.
Thus, by formula (15.5'), we have

\[ z(t) = \left[z(0) - \frac{s}{\lambda}\right]e^{-(1-\alpha)\lambda t} + \frac{s}{\lambda} \]

The substitution of $z = k^{1-\alpha}$ will then yield the final solution

\[ k^{1-\alpha} = \left[k(0)^{1-\alpha} - \frac{s}{\lambda}\right]e^{-(1-\alpha)\lambda t} + \frac{s}{\lambda} \]

where k(0) is the initial value of the capital-labor ratio k.

This solution is what determines the time path of k. Recalling that $(1 - \alpha)$ and $\lambda$ are
both positive, we see that as $t \to \infty$ the exponential expression will approach zero;
consequently,

\[ k^{1-\alpha} \to \frac{s}{\lambda} \qquad \text{or} \qquad k \to \left(\frac{s}{\lambda}\right)^{1/(1-\alpha)} \qquad \text{as } t \to \infty \]

Therefore, the capital-labor ratio will approach a constant as its equilibrium value. This
equilibrium or steady-state value, $(s/\lambda)^{1/(1-\alpha)}$, varies directly with the propensity to save s,
and inversely with the rate of growth of labor $\lambda$.
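The closed-form time path can be examined numerically for concrete parameter values. In the sketch below, the values of s, $\lambda$, $\alpha$, and k(0) are hypothetical; the code evaluates the final solution, checks it against $\dot{k} = sk^{\alpha} - \lambda k$ by a central difference, and compares a late value of k with the steady-state level.

```python
import math

s, lam, alpha = 0.2, 0.05, 0.5        # hypothetical parameter values
k0 = 1.0                              # hypothetical initial capital-labor ratio

def k_path(t):
    """Closed-form time path k(t) from the final solution in the text."""
    z = (k0 ** (1 - alpha) - s / lam) * math.exp(-(1 - alpha) * lam * t) + s / lam
    return z ** (1 / (1 - alpha))

def residual(t, h=1e-5):
    """k_dot - (s*k**alpha - lam*k), with k_dot from a central difference."""
    k_dot = (k_path(t + h) - k_path(t - h)) / (2 * h)
    k = k_path(t)
    return k_dot - (s * k ** alpha - lam * k)

k_bar = (s / lam) ** (1 / (1 - alpha))      # steady-state value (s/lam)^(1/(1-alpha))
worst = max(abs(residual(t)) for t in (0.0, 10.0, 50.0))
gap_late = abs(k_path(1000.0) - k_bar)      # distance from steady state at t = 1000
```

With these assumed numbers the steady-state ratio is 16; raising s or lowering $\lambda$ in the sketch raises `k_bar`, in line with the comparative-static remark that closes the section.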





EXERCISE 15.7 


1. Divide (15.30) through by k, and interpret the resulting equation in terms of the
growth rates of k, K, and L.

2. Show that, if capital is growing at the rate $\lambda$ (that is, $K = Ae^{\lambda t}$), net investment I must
also be growing at the rate $\lambda$.

3. The original input variables of the Solow model are K and L, but the fundamental
equation (15.30) focuses on the capital-labor ratio k instead. What assumption(s) in the
model is(are) responsible for (and make possible) this shift of focus? Explain.

4. Draw a phase diagram for each of the following, and discuss the qualitative aspects of
the time path y(t):

(a) $\dot{y} = 3 - y - \ln y$          (b) $\dot{y} = e^{y} - (y + 2)$


Chapter 16

Higher-Order Differential Equations


In Chap. 15, we discussed the methods of solving a first-order differential equation, one in
which there appears no derivative (or differential) of orders higher than 1. At times, however,
the specification of a model may involve the second derivative or a derivative of an
even higher order. We may, for instance, be given a function describing "the rate of change
of the rate of change" of the income variable Y, say,

\[ \frac{d^2Y}{dt^2} = kY \]

from which we are supposed to find the time path of Y. In this event, the given function
constitutes a second-order differential equation, and the task of finding the time path Y(t) is
that of solving the second-order differential equation. The present chapter is concerned
with the methods of solution and the economic applications of such higher-order differential
equations, but we shall confine our discussion to the linear case only.

A simple variety of linear differential equations of order n is of the following form:

\[ \frac{d^n y}{dt^n} + a_1\frac{d^{n-1}y}{dt^{n-1}} + \cdots + a_{n-1}\frac{dy}{dt} + a_n y = b \tag{16.1} \]

or, in an alternative notation,

\[ y^{(n)}(t) + a_1 y^{(n-1)}(t) + \cdots + a_{n-1}y'(t) + a_n y = b \tag{16.1'} \]

This equation is of order n, because the nth derivative (the first term on the left) is the
highest derivative present. It is linear, since all the derivatives, as well as the dependent variable
y, appear only in the first degree, and moreover, no product term occurs in which y and any
of its derivatives are multiplied together. You will note, in addition, that this differential
equation is characterized by constant coefficients (the a's) and a constant term (b). The
constancy of the coefficients is an assumption we shall retain throughout this chapter. The
constant term b, on the other hand, is adopted here as a first approach; later, in Sec. 16.5,
we shall drop it in favor of a variable term.




16.1 Second-Order Linear Differential Equations 
with Constant Coefficients and Constant Term 





For pedagogic reasons, let us first discuss the method of solution for the second-order case
($n = 2$). The relevant differential equation is then the simple one

$$y''(t) + a_1 y'(t) + a_2 y = b \tag{16.2}$$

where $a_1$, $a_2$, and $b$ are all constants. If the term $b$ is identically zero, we have a homogeneous
equation, but if $b$ is a nonzero constant, the equation is nonhomogeneous. Our
discussion will proceed on the assumption that (16.2) is nonhomogeneous; in solving the
nonhomogeneous version of (16.2), the solution of the homogeneous version will emerge
automatically as a by-product.

In this connection, we recall a proposition introduced in Sec. 15.1 which is equally
applicable here: If $y_c$ is the complementary function, i.e., the general solution (containing
arbitrary constants) of the reduced equation of (16.2), and if $y_p$ is the particular integral, i.e.,
any particular solution (containing no arbitrary constants) of the complete equation (16.2),
then $y(t) = y_c + y_p$ will be the general solution of the complete equation. As was explained
previously, the $y_p$ component provides us with the equilibrium value of the variable $y$ in the
intertemporal sense of the term, whereas the $y_c$ component reveals, for each point of time,
the deviation of the time path $y(t)$ from the equilibrium.


The Particular Integral
For the case of constant coefficients and constant term, the particular integral is relatively
easy to find. Since the particular integral can be any solution of (16.2), i.e., any value of $y$
that satisfies this nonhomogeneous equation, we should always try the simplest possible
type: namely, $y$ = a constant. If $y$ = a constant, it follows that

$$y'(t) = y''(t) = 0$$

so that (16.2) in effect becomes $a_2 y = b$, with the solution $y = b/a_2$. Thus, the desired
particular integral is

$$y_p = \frac{b}{a_2} \qquad (\text{case of } a_2 \neq 0) \tag{16.3}$$

Since the process of finding the value of $y_p$ involves the condition $y'(t) = 0$, the rationale
for considering that value as an intertemporal equilibrium becomes self-evident.

Example 1
Find the particular integral of the equation

$$y''(t) + y'(t) - 2y = -10$$

The relevant coefficients here are $a_2 = -2$ and $b = -10$. Therefore, the particular integral is
$y_p = -10/(-2) = 5$.


What if $a_2 = 0$, so that the expression $b/a_2$ is not defined? In such a situation, since the
constant solution for $y_p$ fails to work, we must try some nonconstant form of solution. Taking
the simplest possibility, we may try $y = kt$. Since $a_2 = 0$, the differential equation is now

$$y''(t) + a_1 y'(t) = b$$

but if $y = kt$, which implies $y'(t) = k$ and $y''(t) = 0$, this equation reduces to $a_1 k = b$.
This determines the value of $k$ as $b/a_1$, thereby giving us the particular integral

$$y_p = \frac{b}{a_1}\,t \qquad (\text{case of } a_2 = 0;\ a_1 \neq 0) \tag{16.3'}$$

Inasmuch as $y_p$ is in this case a nonconstant function of time, we shall regard it as a moving
equilibrium.

Example 2
Find the $y_p$ of the equation $y''(t) + y'(t) = -10$. Here, we have $a_2 = 0$, $a_1 = 1$, and
$b = -10$. Thus, by (16.3'), we can write

$$y_p = -10t$$

If it happens that $a_1$ is also zero, then the solution form of $y = kt$ will also break down,
because the expression $b/a_1$ will now be undefined. We ought, then, to try a solution of the
form $y = kt^2$. With $a_1 = a_2 = 0$, the differential equation now reduces to the extremely
simple form

$$y''(t) = b$$

and if $y = kt^2$, which implies $y'(t) = 2kt$ and $y''(t) = 2k$, the differential equation can be
written as $2k = b$. Thus, we find $k = b/2$, and the particular integral is

$$y_p = \frac{b}{2}\,t^2 \qquad (\text{case of } a_1 = a_2 = 0) \tag{16.3''}$$

The equilibrium represented by this particular integral is again a moving equilibrium.

Example 3
Find the $y_p$ of the equation $y''(t) = -10$. Since the coefficients are $a_1 = a_2 = 0$ and
$b = -10$, formula (16.3'') is applicable. The desired answer is $y_p = -5t^2$.
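
The three cases (16.3), (16.3'), and (16.3'') can be collected into a single rule. What follows is only a sketch under our own naming: the functions `particular_integral` and `residual` are hypothetical helpers, and the particular integral is represented as a pair (coefficient, power) meaning $y_p = \text{coefficient} \cdot t^{\text{power}}$.

```python
from fractions import Fraction

def particular_integral(a1, a2, b):
    """Return (c, p) such that y_p = c * t**p solves y'' + a1*y' + a2*y = b."""
    a1, a2, b = Fraction(a1), Fraction(a2), Fraction(b)
    if a2 != 0:
        return b / a2, 0      # (16.3): constant equilibrium b/a2
    if a1 != 0:
        return b / a1, 1      # (16.3'): moving equilibrium (b/a1) t
    return b / 2, 2           # (16.3''): moving equilibrium (b/2) t^2

def residual(a1, a2, b, t):
    """Left side minus right side of (16.2) at y = y_p; zero if y_p is valid."""
    c, p = particular_integral(a1, a2, b)
    y = c * t**p
    dy = c * p * t**(p - 1) if p >= 1 else 0
    d2y = c * p * (p - 1) * t**(p - 2) if p >= 2 else 0
    return d2y + a1 * dy + a2 * y - b
```

Applied to Examples 1 through 3, the helper returns the pairs (5, 0), (-10, 1), and (-5, 2), that is, $y_p = 5$, $y_p = -10t$, and $y_p = -5t^2$.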


The Complementary Function 
The complementary function of (16.2) is defined to be the general solution of its reduced
(homogeneous) equation

$$y''(t) + a_1 y'(t) + a_2 y = 0 \tag{16.4}$$

This is why we stated that the solution of a homogeneous equation will always be a
by-product in the process of solving a complete equation.

Even though we have never tackled such an equation before, our experience with the
complementary function of the first-order differential equations can supply us with a useful
hint. From the solutions (15.3), (15.3'), (15.5), and (15.5'), it is clear that exponential
expressions of the form $Ae^{rt}$ figure very prominently in the complementary functions of
first-order differential equations with constant coefficients. Then why not try a solution of
the form $y = Ae^{rt}$ in the second-order equation, too?

If we adopt the trial solution $y = Ae^{rt}$, we must also accept

$$y'(t) = rAe^{rt} \qquad \text{and} \qquad y''(t) = r^2 Ae^{rt}$$

as the derivatives of $y$. On the basis of these expressions for $y$, $y'(t)$, and $y''(t)$, the reduced
differential equation (16.4) can be transformed into

$$Ae^{rt}\,(r^2 + a_1 r + a_2) = 0 \tag{16.4'}$$

As long as we choose those values of $A$ and $r$ that satisfy (16.4'), the trial solution $y = Ae^{rt}$
should work. Since $e^{rt}$ can never be zero, we must either let $A = 0$ or see to it that $r$
satisfies the equation

$$r^2 + a_1 r + a_2 = 0 \tag{16.4''}$$

Since the value of the (arbitrary) constant $A$ is to be definitized by use of the initial conditions
of the problem, however, we cannot simply set $A = 0$ at will. Therefore, it is essential
to look for values of $r$ that satisfy (16.4'').

Equation (16.4'') is known as the characteristic equation (or auxiliary equation) of the
homogeneous equation (16.4), or of the complete equation (16.2). Because it is a quadratic
equation in $r$, it yields two roots (solutions), referred to in the present context as
characteristic roots, as follows:†

$$r_1, r_2 = \frac{-a_1 \pm \sqrt{a_1^2 - 4a_2}}{2} \tag{16.5}$$

These two roots bear a simple but interesting relationship to each other, which can serve as
a convenient means of checking our calculation: The sum of the two roots is always equal to
$-a_1$, and their product is always equal to $a_2$. The proof of this statement is straightforward:

$$r_1 + r_2 = \frac{-a_1 + \sqrt{a_1^2 - 4a_2}}{2} + \frac{-a_1 - \sqrt{a_1^2 - 4a_2}}{2} = \frac{-2a_1}{2} = -a_1$$

$$r_1 r_2 = \frac{(-a_1)^2 - (a_1^2 - 4a_2)}{4} = \frac{4a_2}{4} = a_2 \tag{16.6}$$
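
The check in (16.6) is easy to automate. The sketch below covers only the case in which the roots are real ($a_1^2 \geq 4a_2$) and presupposes a characteristic equation already in the normalized form; the function names and the tolerance are our own assumptions.

```python
import math

def real_roots(a1, a2):
    """Roots of r^2 + a1*r + a2 = 0 by formula (16.5), real case only."""
    disc = a1 * a1 - 4 * a2
    if disc < 0:
        raise ValueError("a1^2 < 4*a2: the roots are not real")
    s = math.sqrt(disc)
    return (-a1 + s) / 2, (-a1 - s) / 2

def passes_check(a1, a2, r1, r2, tol=1e-9):
    """Check (16.6): r1 + r2 = -a1 and r1 * r2 = a2, up to rounding."""
    return abs((r1 + r2) + a1) <= tol and abs(r1 * r2 - a2) <= tol

r1, r2 = real_roots(1, -2)          # coefficients chosen for illustration
ok = passes_check(1, -2, r1, r2)
```

A pair of candidate roots that fails `passes_check` signals an arithmetic slip somewhere in the application of (16.5).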


The values of these two roots are the only values we may assign to $r$ in the solution
$y = Ae^{rt}$. But this means that, in effect, there are two solutions which will work, namely,

$$y_1 = A_1 e^{r_1 t} \qquad \text{and} \qquad y_2 = A_2 e^{r_2 t}$$

where $A_1$ and $A_2$ are two arbitrary constants, and $r_1$ and $r_2$ are the characteristic roots
found from (16.5). Since we want only one general solution, however, there seems to be
one too many. Two alternatives are now open to us: (1) pick either $y_1$ or $y_2$ at random, or
(2) combine them in some fashion.

† Note that the quadratic equation (16.4'') is in the normalized form; the coefficient of the $r^2$ term is 1.
In applying formula (16.5) to find the characteristic roots of a differential equation, we must first
make sure that the characteristic equation is indeed in the normalized form.

The first alternative, though simpler, is unacceptable. There is only one arbitrary constant
in $y_1$ or $y_2$, but to qualify as a general solution of a second-order differential equation,
the expression must contain two arbitrary constants. This requirement stems from the fact
that, in proceeding from a function $y(t)$ to its second derivative $y''(t)$, we “lose” two
constants during the two rounds of differentiation; therefore, to revert from a second-order
differential equation to the primitive function $y(t)$, two constants should be reinstated.
That leaves us only the alternative of combining $y_1$ and $y_2$, so as to include both constants
$A_1$ and $A_2$. As it turns out, we can simply take their sum, $y_1 + y_2$, as the general solution of
(16.4). Let us demonstrate that, if $y_1$ and $y_2$, respectively, satisfy (16.4), then the sum
$(y_1 + y_2)$ will also do so. If $y_1$ and $y_2$ are indeed solutions of (16.4), then by substituting
each of these into (16.4), we must find that the following two equations hold:

$$y_1''(t) + a_1 y_1'(t) + a_2 y_1 = 0$$
$$y_2''(t) + a_1 y_2'(t) + a_2 y_2 = 0$$

By adding these equations, however, we find that

$$\underbrace{[y_1''(t) + y_2''(t)]}_{=\,\frac{d^2}{dt^2}(y_1 + y_2)} + a_1 \underbrace{[y_1'(t) + y_2'(t)]}_{=\,\frac{d}{dt}(y_1 + y_2)} + a_2\,(y_1 + y_2) = 0$$

Thus, like $y_1$ or $y_2$, the sum $(y_1 + y_2)$ satisfies the equation (16.4) as well. Accordingly, the
general solution of the homogeneous equation (16.4) or the complementary function of the
complete equation (16.2) can, in general, be written as $y_c = y_1 + y_2$.

A more careful examination of the characteristic-root formula (16.5) indicates, however,
that as far as the values of $r_1$ and $r_2$ are concerned, three possible cases can arise, some of
which may necessitate a modification of our result $y_c = y_1 + y_2$.


Case 1 (distinct real roots) When $a_1^2 > 4a_2$, the square root in (16.5) is a real number,
and the two roots $r_1$ and $r_2$ will take distinct real values, because the square root is added to
$-a_1$ for $r_1$, but subtracted from $-a_1$ for $r_2$. In this case, we can indeed write

$$y_c = y_1 + y_2 = A_1 e^{r_1 t} + A_2 e^{r_2 t} \qquad (r_1 \neq r_2) \tag{16.7}$$

Because the two roots are distinct, the two exponential expressions must be linearly independent
(neither is a multiple of the other); consequently, $A_1$ and $A_2$ will always remain as
separate entities and provide us with two constants, as required.

Example 4


Solve the differential equation

$$y''(t) + y'(t) - 2y = -10$$

The particular integral of this equation has already been found to be $y_p = 5$, in Example 1.
Let us find the complementary function. Since the coefficients of the equation are $a_1 = 1$
and $a_2 = -2$, the characteristic roots are, by (16.5),

$$r_1, r_2 = \frac{-1 \pm \sqrt{1 + 8}}{2} = \frac{-1 \pm 3}{2} = 1,\ -2$$

(Check: $r_1 + r_2 = -1 = -a_1$; $r_1 r_2 = -2 = a_2$.) Since the roots are distinct real numbers,
the complementary function is $y_c = A_1 e^{t} + A_2 e^{-2t}$. Therefore, the general solution can be
written as

$$y(t) = y_c + y_p = A_1 e^{t} + A_2 e^{-2t} + 5 \tag{16.8}$$

In order to definitize the constants $A_1$ and $A_2$, there is need now for two initial conditions.
Let these conditions be $y(0) = 12$ and $y'(0) = -2$. That is, when $t = 0$, $y(t)$ and $y'(t)$
are, respectively, 12 and $-2$. Setting $t = 0$ in (16.8), we find that

$$y(0) = A_1 + A_2 + 5$$

Differentiating (16.8) with respect to $t$ and then setting $t = 0$ in the derivative, we find that

$$y'(t) = A_1 e^{t} - 2A_2 e^{-2t} \qquad \text{and} \qquad y'(0) = A_1 - 2A_2$$

To satisfy the two initial conditions, therefore, we must set $y(0) = 12$ and $y'(0) = -2$,
which results in the following pair of simultaneous equations:

$$A_1 + A_2 = 7$$
$$A_1 - 2A_2 = -2$$

with solutions $A_1 = 4$ and $A_2 = 3$. Thus the definite solution of the differential equation is

$$y(t) = 4e^{t} + 3e^{-2t} + 5 \tag{16.8'}$$

As before, we can check the validity of this solution by differentiation. The first and
second derivatives of (16.8') are

$$y'(t) = 4e^{t} - 6e^{-2t} \qquad \text{and} \qquad y''(t) = 4e^{t} + 12e^{-2t}$$

When these are substituted into the given differential equation along with (16.8'), the result
is an identity $-10 = -10$. Thus the solution is correct. As you can easily verify, (16.8') also
satisfies both of the initial conditions.
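
The verification by substitution can also be carried out numerically. The sketch below simply codes (16.8') and its two derivatives as stated above and evaluates the left side of the differential equation at a few arbitrarily chosen points of time; the function names are ours.

```python
import math

def y(t):
    """Definite solution (16.8')."""
    return 4 * math.exp(t) + 3 * math.exp(-2 * t) + 5

def dy(t):
    """First derivative of (16.8')."""
    return 4 * math.exp(t) - 6 * math.exp(-2 * t)

def d2y(t):
    """Second derivative of (16.8')."""
    return 4 * math.exp(t) + 12 * math.exp(-2 * t)

def left_side(t):
    """y'' + y' - 2y, which should equal -10 at every t."""
    return d2y(t) + dy(t) - 2 * y(t)

checks = [left_side(t) for t in (0.0, 0.5, 1.0, 2.0)]
```

Every entry of `checks` equals $-10$ up to rounding error, and `y(0)` and `dy(0)` reproduce the initial conditions 12 and $-2$.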


Case 2 (repeated real roots) When the coefficients in the differential equation are such
that $a_1^2 = 4a_2$, the square root in (16.5) will vanish, and the two characteristic roots take an
identical value:

$$r\ (= r_1 = r_2) = -\frac{a_1}{2}$$

Such roots are known as repeated roots, or multiple (here, double) roots.
If we attempt to write the complementary function as $y_c = y_1 + y_2$, the sum will in this
case collapse into a single expression

$$y_c = A_1 e^{rt} + A_2 e^{rt} = (A_1 + A_2)e^{rt} = A_3 e^{rt}$$

leaving us with only one constant. This is not sufficient to lead us from a second-order
differential equation back to its primitive function. The only way out is to find another eligible
component term for the sum, a term which satisfies (16.4) and yet which is linearly
independent of the term $A_3 e^{rt}$, so as to preclude such “collapsing.”

An expression that will satisfy these requirements is $A_4 t e^{rt}$. Since the variable $t$ has
entered into it multiplicatively, this component term is obviously linearly independent of
the $A_3 e^{rt}$ term; thus it will enable us to introduce another constant, $A_4$. But does $A_4 t e^{rt}$
qualify as a solution of (16.4)? If we try $y = A_4 t e^{rt}$, then, by the product rule, we can find
its first and second derivatives to be

$$y'(t) = (rt + 1)A_4 e^{rt} \qquad \text{and} \qquad y''(t) = (r^2 t + 2r)A_4 e^{rt}$$

Substituting these expressions of $y$, $y'$, and $y''$ into the left side of (16.4), we get the
expression

$$[(r^2 t + 2r) + a_1(rt + 1) + a_2 t]\,A_4 e^{rt}$$

Inasmuch as, in the present context, we have $a_1^2 = 4a_2$ and $r = -a_1/2$, this last expression
vanishes identically and thus is always equal to the right side of (16.4); this shows that
$A_4 t e^{rt}$ does indeed qualify as a solution.

Hence, the complementary function of the double-root case can be written as

$$y_c = A_3 e^{rt} + A_4 t e^{rt} \tag{16.9}$$

Example 5


Solve the differential equation

$$y''(t) + 6y'(t) + 9y = 27$$

Here, the coefficients are $a_1 = 6$ and $a_2 = 9$; since $a_1^2 = 4a_2$, the roots will be repeated.
According to formula (16.5), we have $r = -a_1/2 = -3$. Thus, in line with the result in
(16.9), the complementary function may be written as

$$y_c = A_3 e^{-3t} + A_4 t e^{-3t}$$

The general solution of the given differential equation is now also readily obtainable.
Trying a constant solution for the particular integral, we get $y_p = 3$. It follows that the
general solution of the complete equation is

$$y(t) = y_c + y_p = A_3 e^{-3t} + A_4 t e^{-3t} + 3$$

The two arbitrary constants can again be definitized with two initial conditions. Suppose
that the initial conditions are $y(0) = 5$ and $y'(0) = -5$. By setting $t = 0$ in the preceding
general solution, we should find $y(0) = 5$; that is,

$$y(0) = A_3 + 3 = 5$$

This yields $A_3 = 2$. Next, by differentiating the general solution and then setting $t = 0$ and
also $A_3 = 2$, we must have $y'(0) = -5$. That is,

$$y'(t) = -3A_3 e^{-3t} - 3A_4 t e^{-3t} + A_4 e^{-3t}$$
$$\text{and} \qquad y'(0) = -6 + A_4 = -5$$

This yields $A_4 = 1$. Thus we can finally write the definite solution of the given equation as

$$y(t) = 2e^{-3t} + te^{-3t} + 3$$
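
The steps followed in Examples 4 and 5 (find $y_p$, find the roots, then definitize the constants from the two initial conditions) can be sketched as one routine. This is an illustration under stated assumptions rather than a general solver: it presumes $a_2 \neq 0$ (so that the constant particular integral applies) and real roots, and the name `definite_solution` and its return layout are hypothetical.

```python
import math

def definite_solution(a1, a2, b, y0, dy0):
    """Definite solution of y'' + a1*y' + a2*y = b for Cases 1 and 2.

    Returns (constants, roots, path), where path(t) evaluates y(t).
    Assumes a2 != 0 and a1^2 >= 4*a2.
    """
    if a2 == 0:
        raise ValueError("this sketch assumes a2 != 0")
    disc = a1 * a1 - 4 * a2
    if disc < 0:
        raise ValueError("complex roots (Case 3) are not covered here")
    yp = b / a2                          # particular integral (16.3)
    gap = y0 - yp                        # initial deviation from equilibrium
    if disc > 0:                         # Case 1, complementary function (16.7)
        root = math.sqrt(disc)
        r1, r2 = (-a1 + root) / 2, (-a1 - root) / 2
        # Initial conditions: A1 + A2 = gap and r1*A1 + r2*A2 = dy0
        A2 = (dy0 - r1 * gap) / (r2 - r1)
        A1 = gap - A2
        def path(t):
            return A1 * math.exp(r1 * t) + A2 * math.exp(r2 * t) + yp
        return (A1, A2), (r1, r2), path
    r = -a1 / 2                          # Case 2, complementary function (16.9)
    A3 = gap                             # from y(0) = A3 + yp
    A4 = dy0 - r * A3                    # from y'(0) = r*A3 + A4
    def path(t):
        return (A3 + A4 * t) * math.exp(r * t) + yp
    return (A3, A4), (r, r), path

consts4, roots4, path4 = definite_solution(1, -2, -10, 12, -2)   # Example 4
consts5, roots5, path5 = definite_solution(6, 9, 27, 5, -5)      # Example 5
```

For the data of Example 4 the routine gives the constants (4, 3) with roots (1, $-2$); for Example 5 it gives (2, 1) with the double root $-3$, in agreement with the solutions worked out by hand.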





Case 3 (complex roots) There remains a third possibility regarding the relative magnitude
of the coefficients $a_1$ and $a_2$, namely, $a_1^2 < 4a_2$. When this eventuality occurs, formula
(16.5) will involve the square root of a negative number, which cannot be handled before
we are properly introduced to the concepts of imaginary and complex numbers. For the
time being, therefore, we shall be content with the mere cataloging of this case and shall
leave the full discussion of it to Secs. 16.2 and 16.3.


The three cases cited can be illustrated by the three curves in Fig. 16.1, each of which
represents a different version of the quadratic function $f(r) = r^2 + a_1 r + a_2$. As we
learned earlier, when such a function is set equal to zero, the result is a quadratic equation
$f(r) = 0$, and to solve the latter equation is merely to “find the zeros of the quadratic
function.” Graphically, this means that the roots of the equation are to be found on the
horizontal axis, where $f(r) = 0$.

FIGURE 16.1 Three curves of $f(r)$: complex roots (top curve), repeated real roots (middle curve), distinct real roots (lowest curve)

The position of the lowest curve in Fig. 16.1 is such that the curve intersects the horizontal
axis twice; thus we can find two distinct roots $r_1$ and $r_2$, both of which satisfy the
quadratic equation $f(r) = 0$ and both of which, of course, are real-valued. Thus the lowest
curve illustrates Case 1. Turning to the middle curve, we note that it meets the horizontal
axis only once, at $r_3$. This latter is the only value of $r$ that can satisfy the equation $f(r) = 0$.
Therefore, the middle curve illustrates Case 2. Last, we note that the top curve does not
meet the horizontal axis at all, and there is thus no real-valued root to the equation
$f(r) = 0$. While there exist no real roots in such a case, there are nevertheless two complex
numbers that can satisfy the equation, as will be shown in Sec. 16.2.


The Dynamic Stability of Equilibrium 
For Cases 1 and 2, the condition for dynamic stability of equilibrium again depends on the
algebraic signs of the characteristic roots.

For Case 1, the complementary function (16.7) consists of the two exponential expressions
$A_1 e^{r_1 t}$ and $A_2 e^{r_2 t}$. The coefficients $A_1$ and $A_2$ are arbitrary constants; their values
hinge on the initial conditions of the problem. Thus we can be sure of a dynamically stable
equilibrium ($y_c \to 0$ as $t \to \infty$), regardless of what the initial conditions happen to be, if
and only if the roots $r_1$ and $r_2$ are both negative. We emphasize the word both here, because
the condition for dynamic stability does not permit even one of the roots to be positive or
zero. If $r_1 = 2$ and $r_2 = -5$, for instance, it might appear at first glance that the second
root, being larger in absolute value, can outweigh the first. In actuality, however, it is the
positive root that must eventually dominate, because as $t$ increases, $e^{2t}$ will grow increasingly
larger, but $e^{-5t}$ will steadily dwindle away.

For Case 2, with repeated roots, the complementary function (16.9) contains not only
the familiar $e^{rt}$ expression, but also a multiplicative expression $te^{rt}$. For the former term to
approach zero whatever the initial conditions may be, it is necessary-and-sufficient to have
$r < 0$. But would that also ensure the vanishing of $te^{rt}$? As it turns out, the expression $te^{rt}$
(or, more generally, $t^k e^{rt}$) possesses the same general type of time path as does $e^{rt}$ ($r \neq 0$).
Thus the condition $r < 0$ is indeed necessary-and-sufficient for the entire complementary
function to approach zero as $t \to \infty$, yielding a dynamically stable intertemporal
equilibrium.
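
Both points can be illustrated with a few numbers. In the sketch below, the constants $A_1 = A_2 = 1$ and the sample values of $t$ are arbitrary choices of ours; the roots 2 and $-5$ are the ones used in the illustration above.

```python
import math

def is_stable(r1, r2):
    """Real-root stability condition: both roots must be negative."""
    return r1 < 0 and r2 < 0

def mixed_path(t, A1=1.0, A2=1.0, r1=2.0, r2=-5.0):
    """Complementary function (16.7) with one positive and one negative root."""
    return A1 * math.exp(r1 * t) + A2 * math.exp(r2 * t)

def t_exp(t, r):
    """The multiplicative term t * e^(rt) appearing in (16.9)."""
    return t * math.exp(r * t)

growth = [mixed_path(t) for t in (0, 1, 10)]      # positive root dominates
decay = [t_exp(t, -1.0) for t in (1, 10, 50)]     # dies out despite the factor t
```

The values in `growth` increase without bound, whereas those in `decay` shrink toward zero even though the factor $t$ itself keeps growing.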







EXERCISE 16.1 
1. Find the particular integral of each equation:
(a) $y''(t) - 2y'(t) + 5y = 2$        (d) $y''(t) + 2y'(t) - y = -4$
(b) $y''(t) + y'(t) = 7$              (e) $y''(t) = 12$
(c) $y''(t) + 3y = 9$

2. Find the complementary function of each equation:
(a) $y''(t) + 3y'(t) - 4y = 12$       (c) $y''(t) - 2y'(t) + y = 3$
(b) $y''(t) + 6y'(t) + 5y = 10$       (d) $y''(t) + 8y'(t) + 16y = 0$

3. Find the general solution of each differential equation in Prob. 2, and then definitize
the solution with the initial conditions $y(0) = 4$ and $y'(0) = 2$.

4. Are the intertemporal equilibriums found in Prob. 3 dynamically stable?

5. Verify that the definite solution in Example 5 indeed (a) satisfies the two initial conditions
and (b) has first and second derivatives that conform to the given differential
equation.

6. Show that, as $t \to \infty$, the limit of $te^{rt}$ is zero if $r < 0$, but is infinite if $r > 0$.


16.2 Complex Numbers and Circular Functions





When the coefficients of a second-order linear differential equation, $y''(t) + a_1 y'(t) +
a_2 y = b$, are such that $a_1^2 < 4a_2$, the characteristic-root formula (16.5) would call for taking
the square root of a negative number. Since the square of any positive or negative real
number is invariably positive, whereas the square of zero is zero, only a nonnegative real
number can ever yield a real-valued square root. Thus, if we confine our attention to the
real number system, as we have so far, no characteristic roots are available for this case
(Case 3). This fact motivates us to consider numbers outside of the real-number system.


Imaginary and Complex Numbers 

Conceptually, it is possible to define a number $i \equiv \sqrt{-1}$, which when squared will equal
$-1$. Because $i$ is the square root of a negative number, it is obviously not real-valued; it is
therefore referred to as an imaginary number. With it at our disposal, we may write a host
of other imaginary numbers, such as $\sqrt{-9} = \sqrt{9}\sqrt{-1} = 3i$ and $\sqrt{-2} = \sqrt{2}\,i$.

Extending its application a step further, we may construct yet another type of number,
one that contains a real part as well as an imaginary part, such as $(8 + i)$ and $(3 + 5i)$.
Known as complex numbers, these can be represented generally in the form $(h + vi)$,
where $h$ and $v$ are two real numbers.† Of course, in case $v = 0$, the complex number will
reduce to a real number, whereas if $h = 0$, it will become an imaginary number. Thus the
set of all real numbers (call it $\mathbf{R}$) constitutes a subset of the set of all complex numbers (call
it $\mathbf{C}$). Similarly, the set of all imaginary numbers (call it $\mathbf{I}$) also constitutes a subset of $\mathbf{C}$.
That is, $\mathbf{R} \subset \mathbf{C}$, and $\mathbf{I} \subset \mathbf{C}$. Furthermore, since the terms real and imaginary are mutually
exclusive, the sets $\mathbf{R}$ and $\mathbf{I}$ must be disjoint; that is, $\mathbf{R} \cap \mathbf{I} = \emptyset$.


† We employ the symbols $h$ (for horizontal) and $v$ (for vertical) in the general complex-number
notation, because we shall presently plot the values of $h$ and $v$, respectively, on the horizontal and
vertical axes of a two-dimensional diagram.


FIGURE 16.2 Argand diagram: real axis (horizontal), imaginary axis (vertical)





A complex number $(h + vi)$ can be represented graphically in what is called an Argand
diagram, as illustrated in Fig. 16.2. By plotting $h$ horizontally on the real axis and $v$ vertically
on the imaginary axis, the number $(h + vi)$ can be specified by the point $(h, v)$, which
we have alternatively labeled $C$. The values of $h$ and $v$ are algebraically signed, of course,
so that if $h < 0$, the point $C$ will be to the left of the point of origin; similarly, a negative $v$
will mean a location below the horizontal axis.

Given the values of $h$ and $v$, we can also calculate the length of the line $OC$ by applying
Pythagoras's theorem, which states that the square of the hypotenuse of a right-angled
triangle is the sum of the squares of the other two sides. Denoting the length of $OC$ by $R$
(for radius vector), we have

$$R^2 = h^2 + v^2 \qquad \text{and} \qquad R = \sqrt{h^2 + v^2} \tag{16.10}$$

where the square root is always taken to be positive. The value of $R$ is sometimes called the
absolute value, or modulus, of the complex number $(h + vi)$. (Note that changing the signs
of $h$ and $v$ will produce no effect on the absolute value of the complex number, $R$.) Like $h$
and $v$, then, $R$ is real-valued, but unlike these other values, $R$ is always positive. We shall
find the number $R$ to be of great importance in the ensuing discussion.
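
Formula (16.10) can be tried out in a couple of lines. In this sketch the helper name `modulus` is our own, and the sample number $3 + 4i$ is an arbitrary choice; Python's built-in `complex` type and `abs` are used only as a cross-check.

```python
import math

def modulus(h, v):
    """Absolute value R of the complex number (h + vi), per (16.10)."""
    return math.sqrt(h * h + v * v)

R = modulus(3, 4)                      # the point C = (3, 4)
R_flipped = modulus(-3, -4)            # changing both signs leaves R unchanged
R_builtin = abs(complex(3, 4))         # Python's own modulus, for comparison
```

All three values coincide, which illustrates the remark that the signs of $h$ and $v$ have no effect on $R$.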





Complex Roots 

Meanwhile, let us return to formula (16.5) and examine the case of complex characteristic
roots. When the coefficients of a second-order differential equation are such that $a_1^2 < 4a_2$,
the square-root expression in (16.5) can be written as

$$\sqrt{a_1^2 - 4a_2} = \sqrt{4a_2 - a_1^2}\,\sqrt{-1} = \sqrt{4a_2 - a_1^2}\;i$$

Hence, if we adopt the shorthand

$$h = \frac{-a_1}{2} \qquad \text{and} \qquad v = \frac{\sqrt{4a_2 - a_1^2}}{2}$$

the two roots can be denoted by a pair of conjugate complex numbers:

$$r_1, r_2 = h \pm vi$$

These two complex roots are said to be “conjugate” because they always appear together,
one being the sum of $h$ and $vi$, and the other being the difference between $h$ and $vi$. Note
that they share the same absolute value $R$.

Example 1


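Find the roots of the characteristic equation $r^2 + r + 4 = 0$. Applying the familiar formula,
we have

$$r_1, r_2 = \frac{-1 \pm \sqrt{-15}}{2} = \frac{-1 \pm \sqrt{15}\sqrt{-1}}{2} = \frac{-1}{2} \pm \frac{\sqrt{15}}{2}\,i$$

which constitute a pair of conjugate complex numbers.
As before, we can use (16.6) to check our calculations. If correct, we should have
$r_1 + r_2 = -a_1\ (= -1)$ and $r_1 r_2 = a_2\ (= 4)$. Since we do find

$$r_1 + r_2 = \left(\frac{-1}{2} + \frac{\sqrt{15}}{2}\,i\right) + \left(\frac{-1}{2} - \frac{\sqrt{15}}{2}\,i\right) = -1$$

and

$$r_1 r_2 = \left(\frac{-1}{2} + \frac{\sqrt{15}}{2}\,i\right)\left(\frac{-1}{2} - \frac{\sqrt{15}}{2}\,i\right) = \left(\frac{-1}{2}\right)^2 - \left(\frac{\sqrt{15}}{2}\,i\right)^2 = \frac{1}{4} + \frac{15}{4} = 4$$

our calculation is indeed validated.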
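
The same calculation can be reproduced with complex arithmetic. The sketch below relies on the standard-library `cmath.sqrt`, which returns a complex square root for a negative argument; the function name `characteristic_roots` is our own.

```python
import cmath

def characteristic_roots(a1, a2):
    """Roots of r^2 + a1*r + a2 = 0 by (16.5), allowing complex values."""
    s = cmath.sqrt(a1 * a1 - 4 * a2)
    return (-a1 + s) / 2, (-a1 - s) / 2

r1, r2 = characteristic_roots(1, 4)    # the equation r^2 + r + 4 = 0
root_sum = r1 + r2                     # should equal -a1 = -1
root_product = r1 * r2                 # should equal  a2 =  4
```

Here `r1` and `r2` have the common real part $h = -1/2$ and imaginary parts $\pm\sqrt{15}/2$, and they share the same absolute value, as noted for conjugate roots.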


Even in the complex-root case (Case 3), we may express the complementary function of
a differential equation according to (16.7); that is,

$$y_c = A_1 e^{(h + vi)t} + A_2 e^{(h - vi)t} = e^{ht}\left(A_1 e^{vit} + A_2 e^{-vit}\right) \tag{16.11}$$

But a new feature has been introduced: the number $i$ now appears in the exponents of the two
expressions in parentheses. How do we interpret such imaginary exponential functions?

To facilitate their interpretation, it will prove helpful first to transform these expressions
into equivalent circular-function forms. As we shall presently see, the latter functions
characteristically involve periodic fluctuations of a variable. Consequently, the complementary
function (16.11), being translatable into circular-function forms, can also be expected to
generate a cyclical type of time path.


Circular Functions 
Consider a circle with its center at the point of origin and with a radius of length $R$, as
shown in Fig. 16.3. Let the radius, like the hand of a clock, rotate in the counterclockwise
direction. Starting from the position $OA$, it will gradually move into the position $OP$, followed
successively by such positions as $OB$, $OC$, and $OD$; and at the end of a cycle, it will
return to $OA$. Thereafter, the cycle will simply repeat itself.

When in a specific position, say, $OP$, the clock hand will make a definite angle $\theta$ with
line $OA$, and the tip of the hand ($P$) will determine a vertical distance $v$ and a horizontal distance
$h$. As the angle $\theta$ changes during the process of rotation, $v$ and $h$ will vary, although
$R$ will not. Thus the ratios $v/R$ and $h/R$ must change with $\theta$; that is, these two ratios are
both functions of the angle $\theta$. Specifically, $v/R$ and $h/R$ are called, respectively, the sine
(function) of $\theta$ and the cosine (function) of $\theta$:

$$\sin\theta \equiv \frac{v}{R} \tag{16.12}$$

$$\cos\theta \equiv \frac{h}{R} \tag{16.13}$$

FIGURE 16.3 Circle of radius $R$ centered at the origin. Quadrant I: $v > 0$, $h > 0$; Quadrant II: $v > 0$, $h < 0$; Quadrant III: $v < 0$, $h < 0$; Quadrant IV: $v < 0$, $h > 0$

In view of their connection with a circle, these functions are referred to as circular functions.
Since they are also associated with a triangle, however, they are alternatively called
trigonometric functions. Another (and fancier) name for them is sinusoidal functions. The
sine and cosine functions are not the only circular functions; another frequently encountered
one is the tangent function, defined as

$$\tan\theta = \frac{\sin\theta}{\cos\theta} = \frac{v}{h} \qquad (h \neq 0)$$

Our major concern here, however, will be with the sine and cosine functions.

The independent variable in a circular function is the angle $\theta$, so the mapping involved
here is from an angle to a ratio of two distances. Usually, angles are measured in degrees
(for example, 30, 45, and 90°); in analytical work, however, it is more convenient to measure
angles in radians instead. The advantage of the radian measure stems from the fact
that, when $\theta$ is so measured, the derivatives of circular functions will come out in neater
expressions, much as the base $e$ gives us neater derivatives for exponential and logarithmic
functions. But just how much is a radian? To explain this, let us return to Fig. 16.3,
where we have drawn the point $P$ so that the length of the arc $AP$ is exactly equal to the
radius $R$. A radian (abbreviated as rad) can then be defined as the size of the angle $\theta$
(in Fig. 16.3) formed by such an $R$-length arc. Since the circumference of the circle has
a total length of $2\pi R$ (where $\pi = 3.14159\ldots$), a complete circle must involve an angle
of $2\pi$ rad altogether. In terms of degrees, however, a complete circle makes an angle
of 360°; thus, by equating 360° to $2\pi$ rad, we can arrive at the following conversion
table:

TABLE 16.1 Degrees and radians, with the corresponding values of $\sin\theta$ and $\cos\theta$


Properties of the Sine and Cosine Functions 

Given the length of $R$, the value of $\sin\theta$ hinges upon the way the value of $v$ changes in
response to changes in the angle $\theta$. In the starting position $OA$, we have $v = 0$. As the clock
hand moves counterclockwise, $v$ starts to assume an increasing positive value, culminating
in the maximum value of $v = R$ when the hand coincides with $OB$, that is, when $\theta =
\pi/2$ rad (= 90°). Further movement will gradually shorten $v$, until its value becomes zero
when the hand is in the position $OC$, i.e., when $\theta = \pi$ rad (= 180°). As the hand enters the
third quadrant, $v$ begins to assume negative values; in the position $OD$, we have $v = -R$.
In the fourth quadrant, $v$ is still negative, but it will increase from the value of $-R$ toward
the value of $v = 0$, which is attained when the hand returns to $OA$, that is, when $\theta =
2\pi$ rad (= 360°). The cycle then repeats itself.

When these illustrative values of $v$ are substituted into (16.12), we can obtain the results
shown in the “$\sin\theta$” row of Table 16.1. For a more complete description of the sine function,
however, see the graph in Fig. 16.4a, where the values of $\sin\theta$ are plotted against those
of $\theta$ (expressed in radians).

The value of $\cos\theta$, in contrast, depends instead upon the way that $h$ changes in response
to changes in $\theta$. In the starting position $OA$, we have $h = R$. Then $h$ gradually shrinks, till
$h = 0$ when $\theta = \pi/2$ (position $OB$). In the second quadrant, $h$ turns negative, and when
$\theta = \pi$ (position $OC$), $h = -R$. The value of $h$ gradually increases from $-R$ to zero in the
third quadrant, and when $\theta = 3\pi/2$ (position $OD$), we find that $h = 0$. In the fourth quadrant,
$h$ turns positive again, and when the hand returns to position $OA$ ($\theta = 2\pi$), we again
have $h = R$. The cycle then repeats itself.

The substitution of these illustrative values of $h$ into (16.13) yields the results in the
bottom row of Table 16.1, but Fig. 16.4b gives a more complete depiction of the cosine
function.
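
The illustrative values just described can be generated directly from definitions (16.12) and (16.13), together with the conversion obtained by equating 360° to $2\pi$ rad. In the sketch below, the radius $R = 2$ and all function and variable names are hypothetical choices of ours; the four tip positions are those of the hand at $OA$, $OB$, $OC$, and $OD$.

```python
import math

def to_radians(degrees):
    """Convert degrees to radians by equating 360 degrees with 2*pi rad."""
    return degrees * (2 * math.pi) / 360

def sine_cosine_from_tip(h, v):
    """Return (sin, cos) as the ratios v/R and h/R for the tip (h, v)."""
    R = math.hypot(h, v)
    return v / R, h / R

R = 2.0                                             # hypothetical radius
tips = {"OA": (R, 0.0), "OB": (0.0, R), "OC": (-R, 0.0), "OD": (0.0, -R)}
angles_deg = {"OA": 0, "OB": 90, "OC": 180, "OD": 270}
table = {name: sine_cosine_from_tip(*tips[name]) for name in tips}
```

Each entry of `table` agrees, up to rounding, with `math.sin` and `math.cos` evaluated at the corresponding angle converted by `to_radians`; the choice of $R$ does not matter, since only the ratios enter.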

The $\sin\theta$ and $\cos\theta$ functions share the same domain, namely, the set of all real numbers
(radian measures of $\theta$). In this connection, it may be pointed out that a negative angle
simply refers to the reverse rotation of the clock hand; for instance, a clockwise movement
from $OA$ to $OD$ in Fig. 16.3 generates an angle of $-\pi/2$ rad (= −90°). There is also a
common range for the two functions, namely, the closed interval $[-1, 1]$. For this reason,
the graphs of $\sin\theta$ and $\cos\theta$ are, in Fig. 16.4, confined to a definite horizontal band.

FIGURE 16.4 (a) Graph of $\sin\theta$ against $\theta$; (b) graph of $\cos\theta$ against $\theta$

A major distinguishing property of the sine and cosine functions is that both are periodic;
their values will repeat themselves for every $2\pi$ rad (a complete circle) the angle $\theta$
travels through. Each function is therefore said to have a period of $2\pi$. In view of this
periodicity feature, the following equations hold (for any integer $n$):

$$\sin(\theta + 2n\pi) = \sin\theta \qquad \cos(\theta + 2n\pi) = \cos\theta$$

That is, adding (or subtracting) any integer multiple of $2\pi$ to any angle $\theta$ will affect neither
the value of $\sin\theta$ nor that of $\cos\theta$.
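
The periodicity equations and the common range can be spot-checked numerically. The following sketch uses an arbitrary angle and a handful of integers $n$, both chosen by us for illustration; the helper name is hypothetical.

```python
import math

def max_period_gap(theta, n_values=range(-3, 4)):
    """Largest change in sin or cos when theta is shifted by 2*n*pi."""
    gaps = []
    for n in n_values:
        shifted = theta + 2 * n * math.pi
        gaps.append(abs(math.sin(shifted) - math.sin(theta)))
        gaps.append(abs(math.cos(shifted) - math.cos(theta)))
    return max(gaps)

gap = max_period_gap(0.7)                           # 0.7 rad is arbitrary
sample = [math.sin(x / 10) for x in range(-100, 101)]
within_band = all(-1 <= value <= 1 for value in sample)
```

The gap differs from zero only by floating-point rounding, and every sampled sine value lies within the closed interval $[-1, 1]$.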

The graphs of the sine and cosine functions indicate a constant range of fluctuation in 
each period, namely, £1, This is sometimes alternatively described by saying that the 
amplitude of fluctuation is |. By virtue of the identical period and the identical amplitude, 
we sec that the cos 8 curve, if shifled rightward by 7/2, will be exactly coincident with the 
gin curve. These two curves are therefore said to differ only in phase, i.¢., to differ only 
in the location of the peak in each period. Symbolically, this fact may be stated by the 
equation 


‘x 
@=sin(@t+ > 
cos sin( +5) 


The sine and cosine functions obey certain identities. Among these, the more frequently
used are

$$\sin(-\theta) = -\sin\theta \qquad \cos(-\theta) = \cos\theta \tag{16.14}$$

$$\sin^2\theta + \cos^2\theta = 1 \qquad [\text{where } \sin^2\theta = (\sin\theta)^2, \text{ etc.}] \tag{16.15}$$

$$\begin{aligned}
\sin(\theta_1 \pm \theta_2) &= \sin\theta_1\cos\theta_2 \pm \cos\theta_1\sin\theta_2 \\
\cos(\theta_1 \pm \theta_2) &= \cos\theta_1\cos\theta_2 \mp \sin\theta_1\sin\theta_2
\end{aligned} \tag{16.16}$$

The pair of identities (16.14) serves to underscore the fact that the cosine function is
symmetrical with respect to the vertical axis (that is, $\theta$ and $-\theta$ always yield the same cosine
value), while the sine function is not. Shown in (16.15) is the fact that, for any magnitude
of $\theta$, the sum of the squares of its sine and cosine is always unity. And the set of identities
in (16.16) gives the sine and cosine of the sum and difference of two angles $\theta_1$ and $\theta_2$.

Finally, a word about derivatives. Being continuous and smooth, both $\sin\theta$ and $\cos\theta$ are
differentiable. The derivatives, $d(\sin\theta)/d\theta$ and $d(\cos\theta)/d\theta$, are obtainable by taking the
limits, respectively, of the difference quotients $\Delta(\sin\theta)/\Delta\theta$ and $\Delta(\cos\theta)/\Delta\theta$ as $\Delta\theta \to 0$.
The results, stated here without proof, are

$$\frac{d}{d\theta}\sin\theta = \cos\theta \tag{16.17}$$

$$\frac{d}{d\theta}\cos\theta = -\sin\theta \tag{16.18}$$

It should be emphasized, however, that these derivative formulas are valid only when $\theta$ is
measured in radians; if measured in degrees, for instance, (16.17) will become
$d(\sin\theta)/d\theta = (\pi/180)\cos\theta$ instead. It is for the sake of getting rid of the factor $(\pi/180)$ that radian
measures are preferred to degree measures in analytical work.
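
As a rough numerical check on (16.17) and on the role of the factor $(\pi/180)$, one may compare difference quotients with the stated derivatives. The Python sketch below is only illustrative; the helper names `central_difference` and `sin_degrees`, the step size, and the test angle are assumptions introduced here and are not part of the text.

```python
import math

def central_difference(f, x, step=1e-6):
    """Approximate f'(x) by a symmetric difference quotient."""
    return (f(x + step) - f(x - step)) / (2 * step)

def sin_degrees(angle_deg):
    """Sine of an angle that is expressed in degrees (hypothetical helper)."""
    return math.sin(math.radians(angle_deg))

theta_rad = 1.0                      # an arbitrary angle, in radians
theta_deg = math.degrees(theta_rad)  # the same angle, in degrees

# Slope of the sine curve when the angle is measured in radians.
slope_rad = central_difference(math.sin, theta_rad)
# Slope of the sine curve when the angle is measured in degrees.
slope_deg = central_difference(sin_degrees, theta_deg)

print(slope_rad, math.cos(theta_rad))
print(slope_deg, (math.pi / 180) * math.cos(theta_rad))
```

Each printed pair should agree closely; the second pair is smaller than the first by the factor $\pi/180$, which is what the radian measure removes.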


Example 2
Find the slope of the $\sin\theta$ curve at $\theta = \pi/2$. The slope of the sine curve is given by its
derivative ($= \cos\theta$). Thus, at $\theta = \pi/2$, the slope should be $\cos(\pi/2) = 0$. You may refer to
Fig. 16.4 for verification of this result.


Example 3
Find the second derivative of $\sin\theta$. From (16.17), we know that the first derivative of $\sin\theta$ is
$\cos\theta$; therefore the desired second derivative is

$$\frac{d^2}{d\theta^2}\sin\theta = \frac{d}{d\theta}\cos\theta = -\sin\theta$$


Euler Relations 

In Sec. 9.5, it was shown that any function which has finite, continuous derivatives up to the
desired order can be expanded into a polynomial function. Moreover, if the remainder term
$R_n$ in the resulting Taylor series (expansion at any point $x$) or Maclaurin series (expansion
at $x = 0$) happens to approach zero as the number of terms $n$ becomes infinite, the polynomial
may be written as an infinite series. We shall now expand the sine and cosine functions
and then attempt to show how the imaginary exponential expressions encountered in
(16.11) can be transformed into circular functions having equivalent expansions.




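For the sine function, write $\phi(\theta) = \sin\theta$; it then follows that $\phi(0) = \sin 0 = 0$. By
successive derivation, we can get

$$\begin{aligned}
\phi'(\theta) &= \cos\theta & &\Rightarrow & \phi'(0) &= \cos 0 = 1 \\
\phi''(\theta) &= -\sin\theta & &\Rightarrow & \phi''(0) &= -\sin 0 = 0 \\
\phi'''(\theta) &= -\cos\theta & &\Rightarrow & \phi'''(0) &= -\cos 0 = -1 \\
\phi^{(4)}(\theta) &= \sin\theta & &\Rightarrow & \phi^{(4)}(0) &= \sin 0 = 0 \\
\phi^{(5)}(\theta) &= \cos\theta & &\Rightarrow & \phi^{(5)}(0) &= \cos 0 = 1
\end{aligned}$$

When substituted into (9.14), where $\theta$ now replaces $x$, these will give us the following
Maclaurin series with remainder:

$$\sin\theta = 0 + \theta + 0 - \frac{\theta^3}{3!} + 0 + \frac{\theta^5}{5!} + \cdots + \frac{\phi^{(n+1)}(p)}{(n+1)!}\,\theta^{n+1}$$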


Now, the expression $\phi^{(n+1)}(p)$ in the last (remainder) term, which represents the $(n+1)$st
derivative evaluated at $\theta = p$, can only take the form of $\pm\cos p$ or $\pm\sin p$ and, as such, can
only take a value in the interval $[-1, 1]$, regardless of how large $n$ is. On the other hand,
$(n+1)!$ will grow rapidly as $n \to \infty$, in fact much more rapidly than $\theta^{n+1}$ as $n$ increases.
Hence, the remainder term will approach zero as $n \to \infty$, and we can therefore express the
Maclaurin series as an infinite series:

$$\sin\theta = \theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \frac{\theta^7}{7!} + \cdots \tag{16.19}$$

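Similarly, if we write $\psi(\theta) = \cos\theta$, then $\psi(0) = \cos 0 = 1$, and the successive
derivatives will be

$$\begin{aligned}
\psi'(\theta) &= -\sin\theta & &\Rightarrow & \psi'(0) &= -\sin 0 = 0 \\
\psi''(\theta) &= -\cos\theta & &\Rightarrow & \psi''(0) &= -\cos 0 = -1 \\
\psi'''(\theta) &= \sin\theta & &\Rightarrow & \psi'''(0) &= \sin 0 = 0 \\
\psi^{(4)}(\theta) &= \cos\theta & &\Rightarrow & \psi^{(4)}(0) &= \cos 0 = 1 \\
\psi^{(5)}(\theta) &= -\sin\theta & &\Rightarrow & \psi^{(5)}(0) &= -\sin 0 = 0
\end{aligned}$$

On the basis of these derivatives, we can expand $\cos\theta$ as follows:

$$\cos\theta = 1 + 0 - \frac{\theta^2}{2!} + 0 + \frac{\theta^4}{4!} + \cdots + \frac{\psi^{(n+1)}(p)}{(n+1)!}\,\theta^{n+1}$$

Since the remainder term will again tend toward zero as $n \to \infty$, the cosine function is also
expressible as an infinite series, as follows:

$$\cos\theta = 1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \frac{\theta^6}{6!} + \cdots \tag{16.20}$$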
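
To see how quickly the partial sums of (16.19) and (16.20) settle down, the two series can be summed numerically. The Python sketch below is a minimal illustration; the function names `sin_series` and `cos_series` and the default of ten terms are assumptions made here for convenience.

```python
import math

def sin_series(theta, terms=10):
    """Partial sum of (16.19): theta - theta^3/3! + theta^5/5! - ..."""
    return sum((-1) ** k * theta ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(terms))

def cos_series(theta, terms=10):
    """Partial sum of (16.20): 1 - theta^2/2! + theta^4/4! - ..."""
    return sum((-1) ** k * theta ** (2 * k) / math.factorial(2 * k)
               for k in range(terms))

# Tabulate the partial sums next to the library values (theta in radians).
for theta in (0.0, math.pi / 6, math.pi / 2, math.pi):
    print(f"{theta:.4f}  {sin_series(theta):+.6f}  {math.sin(theta):+.6f}"
          f"  {cos_series(theta):+.6f}  {math.cos(theta):+.6f}")
```

Because the factorial in each denominator grows much faster than the power of $\theta$ in the numerator, a handful of terms already reproduces the library values closely for angles within one revolution.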

You must have noticed that, with (16.19) and (16.20) at hand, we are now capable of
constructing a table of sine and cosine values for all possible values of $\theta$ (in radians).
However, our immediate interest lies in finding the relationship between imaginary exponential
expressions and circular functions. To this end, let us now expand the two exponential
expressions $e^{i\theta}$ and $e^{-i\theta}$. The reader will recognize that these are but special cases of the
expression $e^x$, which has previously been shown, in (10.6), to have the expansion

$$e^x = 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \frac{1}{4!}x^4 + \cdots$$

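Letting $x = i\theta$, therefore, we can immediately obtain

$$\begin{aligned}
e^{i\theta} &= 1 + i\theta + \frac{(i\theta)^2}{2!} + \frac{(i\theta)^3}{3!} + \frac{(i\theta)^4}{4!} + \frac{(i\theta)^5}{5!} + \cdots \\
&= 1 + i\theta - \frac{\theta^2}{2!} - \frac{i\theta^3}{3!} + \frac{\theta^4}{4!} + \frac{i\theta^5}{5!} - \cdots \\
&= \left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots\right) + i\left(\theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots\right)
\end{aligned}$$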


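Similarly, by setting $x = -i\theta$, the following result will emerge:

$$\begin{aligned}
e^{-i\theta} &= 1 - i\theta + \frac{(-i\theta)^2}{2!} + \frac{(-i\theta)^3}{3!} + \frac{(-i\theta)^4}{4!} + \frac{(-i\theta)^5}{5!} + \cdots \\
&= 1 - i\theta - \frac{\theta^2}{2!} + \frac{i\theta^3}{3!} + \frac{\theta^4}{4!} - \frac{i\theta^5}{5!} - \cdots \\
&= \left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots\right) - i\left(\theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots\right)
\end{aligned}$$

By substituting (16.19) and (16.20) into these two results, the following pair of identities,
known as the Euler relations, can readily be established:

$$e^{i\theta} = \cos\theta + i\sin\theta \tag{16.21}$$

$$e^{-i\theta} = \cos\theta - i\sin\theta \tag{16.21'}$$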


These will enable us to translate any imaginary exponential function into an equivalent
linear combination of sine and cosine functions, and vice versa.


Example 4
Find the value of $e^{i\pi}$. First let us convert this expression into a trigonometric expression. By
setting $\theta = \pi$ in (16.21), it is found that $e^{i\pi} = \cos\pi + i\sin\pi$. Since $\cos\pi = -1$ and
$\sin\pi = 0$, it follows that $e^{i\pi} = -1$.

Example 5
Show that $e^{-i\pi/2} = -i$. Setting $\theta = \pi/2$ in (16.21'), we have

$$e^{-i\pi/2} = \cos\frac{\pi}{2} - i\sin\frac{\pi}{2} = 0 - i(1) = -i$$
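
The Euler relations can also be checked numerically. The Python sketch below uses the standard `cmath.exp` for the left-hand sides; the helper names `euler_rhs` and `euler_rhs_conjugate` and the sample angles are assumptions introduced purely for illustration.

```python
import cmath
import math

def euler_rhs(theta):
    """Right-hand side of (16.21): cos(theta) + i*sin(theta)."""
    return complex(math.cos(theta), math.sin(theta))

def euler_rhs_conjugate(theta):
    """Right-hand side of (16.21'): cos(theta) - i*sin(theta)."""
    return complex(math.cos(theta), -math.sin(theta))

# Compare both sides at a few arbitrary angles (in radians).
for theta in (0.3, 1.0, 2.5):
    gap_plus = abs(cmath.exp(1j * theta) - euler_rhs(theta))
    gap_minus = abs(cmath.exp(-1j * theta) - euler_rhs_conjugate(theta))
    print(theta, gap_plus, gap_minus)  # both gaps should be at rounding-error level
```

Setting `theta` to `math.pi` or `math.pi / 2` gives a floating-point counterpart of Examples 4 and 5, up to rounding error.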


Alternative Representations of Complex Numbers 

So far, we have represented a pair of conjugate complex numbers in the general form
$(h \pm vi)$. Since $h$ and $v$ refer to the abscissa and ordinate in the Cartesian coordinate
system of an Argand diagram, the expression $(h \pm vi)$ represents the Cartesian form of a pair
of conjugate complex numbers. As a by-product of the discussion of circular functions and
Euler relations, we can now express $(h \pm vi)$ in two other ways.


Referring to Fig. 16.2, we see that as soon as $h$ and $v$ are specified, the angle $\theta$ and the
value of $R$ also become determinate. Since a given $\theta$ and a given $R$ can together identify a
unique point in the Argand diagram, we may employ $\theta$ and $R$ to specify the particular pair
of complex numbers. By rewriting the definitions of the sine and cosine functions in
(16.12) and (16.13) as

$$v = R\sin\theta \qquad \text{and} \qquad h = R\cos\theta \tag{16.22}$$

the conjugate complex numbers $(h \pm vi)$ can be transformed as follows:

$$h \pm vi = R\cos\theta \pm Ri\sin\theta = R(\cos\theta \pm i\sin\theta)$$

In so doing, we have in effect switched from the Cartesian coordinates of the complex
numbers ($h$ and $v$) to what are called their polar coordinates ($R$ and $\theta$). The right-hand
expression in the preceding equation, accordingly, exemplifies the polar form of a pair of
conjugate complex numbers.

Furthermore, in view of the Euler relations, the polar form may also be rewritten into the
exponential form as follows: $R(\cos\theta \pm i\sin\theta) = Re^{\pm i\theta}$. Hence, we have a total of three
alternative representations of the conjugate complex numbers:

$$h \pm vi = R(\cos\theta \pm i\sin\theta) = Re^{\pm i\theta} \tag{16.23}$$


If we are given the values of $R$ and $\theta$, the transformation to $h$ and $v$ is straightforward:
we use the two equations in (16.22). What about the reverse transformation? With given
values of $h$ and $v$, no difficulty arises in finding the corresponding value of $R$, which is
equal to $\sqrt{h^2 + v^2}$. But a slight ambiguity arises in regard to $\theta$: the desired value of $\theta$ (in
radians) is that which satisfies the two conditions $\cos\theta = h/R$ and $\sin\theta = v/R$; but for
given values of $h$ and $v$, $\theta$ is not unique! (Why?) Fortunately, the problem is not serious, for
by confining our attention to the interval $[0, 2\pi)$ in the domain, the indeterminacy is
quickly resolved.
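
The two-way transformation can be sketched in Python as follows. The function names `to_cartesian` and `to_polar` are hypothetical, and the use of `math.atan2` followed by reduction modulo $2\pi$ is one convenient way (an assumption of this sketch, not the method of the text) to pick the value of $\theta$ in $[0, 2\pi)$ that satisfies both $\cos\theta = h/R$ and $\sin\theta = v/R$.

```python
import math

def to_cartesian(R, theta):
    """Apply (16.22): h = R cos(theta), v = R sin(theta)."""
    return R * math.cos(theta), R * math.sin(theta)

def to_polar(h, v):
    """Return (R, theta) with R = sqrt(h^2 + v^2) and theta reduced to [0, 2*pi)."""
    R = math.hypot(h, v)
    # atan2 uses the signs of both v and h, so it lands in the correct quadrant;
    # the modulo moves negative angles into the interval [0, 2*pi).
    theta = math.atan2(v, h) % (2 * math.pi)
    return R, theta

# Illustrative values only: the point h = -1, v = 1 lies in the second quadrant.
R, theta = to_polar(-1.0, 1.0)
print(R, theta)
print(to_cartesian(R, theta))  # recovers (-1, 1) up to rounding error
```

Using only $\cos\theta = h/R$, or only $\sin\theta = v/R$, would leave two candidate angles within a revolution; it is the pair of conditions together that singles out one.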





Example 6
Find the Cartesian form of the complex number $5e^{3\pi i/2}$. Here we have $R = 5$ and $\theta = 3\pi/2$;
hence, by (16.22) and Table 16.1,

$$h = 5\cos\frac{3\pi}{2} = 0 \qquad \text{and} \qquad v = 5\sin\frac{3\pi}{2} = -5$$

The Cartesian form is thus simply $h + vi = -5i$.


Example 7
Find the polar and exponential forms of $(1 + \sqrt{3}\,i)$. In this case, we have $h = 1$ and $v = \sqrt{3}$;
thus $R = \sqrt{1 + 3} = 2$. Table 16.1 is of no use in locating the value of $\theta$ this time, but
Table 16.2, which lists some additional selected values of $\sin\theta$ and $\cos\theta$, will help. Specifically,
we are seeking the value of $\theta$ such that $\cos\theta = h/R = 1/2$ and $\sin\theta = v/R = \sqrt{3}/2$. The
value $\theta = \pi/3$ meets the requirements. Thus, according to (16.23), the desired transformation
is

$$1 + \sqrt{3}\,i = 2\left(\cos\frac{\pi}{3} + i\sin\frac{\pi}{3}\right) = 2e^{i\pi/3}$$

TABLE 16.2

$\theta$        $\pi/6$         $\pi/4$                         $\pi/3$         $3\pi/4$
$\sin\theta$    $1/2$           $1/\sqrt{2}\ (= \sqrt{2}/2)$    $\sqrt{3}/2$    $1/\sqrt{2}\ (= \sqrt{2}/2)$
$\cos\theta$    $\sqrt{3}/2$    $1/\sqrt{2}\ (= \sqrt{2}/2)$    $1/2$           $-1/\sqrt{2}\ (= -\sqrt{2}/2)$


Before leaving this topic, let us note an important extension of the result in (16.23).
Supposing that we have the $n$th power of a complex number, say, $(h + vi)^n$, how do we
write its polar and exponential forms? The exponential form is the easier to derive. Since
$h + vi = Re^{i\theta}$, it follows that

$$(h + vi)^n = (Re^{i\theta})^n = R^n e^{in\theta}$$

Similarly, we can write

$$(h - vi)^n = (Re^{-i\theta})^n = R^n e^{-in\theta}$$

Note that the power $n$ has brought about two changes: (1) $R$ now becomes $R^n$, and (2) $\theta$
now becomes $n\theta$. When these two changes are inserted into the polar form in (16.23), we
find that

$$(h \pm vi)^n = R^n(\cos n\theta \pm i\sin n\theta) \tag{16.23'}$$

That is,

$$[R(\cos\theta \pm i\sin\theta)]^n = R^n(\cos n\theta \pm i\sin n\theta)$$

Known as De Moivre's theorem, this result indicates that, to raise a complex number to the
$n$th power, one must simply modify its polar coordinates by raising $R$ to the $n$th power and
multiplying $\theta$ by $n$.
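
De Moivre's theorem lends itself to a quick numerical cross-check against repeated multiplication. In the Python sketch below, the helper names `power_polar` and `polar_to_complex` and the sample values of $R$, $\theta$, and $n$ are assumptions chosen for illustration.

```python
import math

def power_polar(R, theta, n):
    """(16.23'): raising to the nth power turns R into R**n and theta into n*theta."""
    return R ** n, n * theta

def polar_to_complex(R, theta):
    """Build h + vi from polar coordinates, using h = R cos(theta), v = R sin(theta)."""
    return complex(R * math.cos(theta), R * math.sin(theta))

R, theta, n = 2.0, math.pi / 3, 5          # illustrative values only
base = polar_to_complex(R, theta)          # the complex number being raised to a power
via_theorem = polar_to_complex(*power_polar(R, theta, n))
via_multiplication = base ** n             # Python's built-in complex power

print(via_theorem)
print(via_multiplication)  # should match the line above up to rounding error
```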





EXERCISE 16.2 


1. Find the roots of the following quadratic equations:
   (a) $r^2 - 3r + 9 = 0$          (c) $2x^2 + x + 8 = 0$
   (b) $r^2 + 2r + 17 = 0$         (d) $2x^2 - x + 1 = 0$
2. (a) How many degrees are there in a radian?
   (b) How many radians are there in a degree?
3. With reference to Fig. 16.3, and by using Pythagoras's theorem, prove that
   (a) $\sin^2\theta + \cos^2\theta = 1$          (b) $\sin\dfrac{\pi}{4} = \cos\dfrac{\pi}{4} = \dfrac{1}{\sqrt{2}}$
4. By means of the identities (16.14), (16.15), and (16.16), show that:
   (a) $\sin 2\theta = 2\sin\theta\cos\theta$
   (c) $\sin(\theta_1 + \theta_2) + \sin(\theta_1 - \theta_2) = 2\sin\theta_1\cos\theta_2$
   (d) $1 + \tan^2\theta = \dfrac{1}{\cos^2\theta}$
   (e) $\sin\left(\dfrac{\pi}{2} - \theta\right) = \cos\theta$          (f) $\cos\left(\dfrac{\pi}{2} - \theta\right) = \sin\theta$
5. By applying the chain rule:
   (a) Write out the derivative formulas for $\dfrac{d}{d\theta}\sin f(\theta)$ and $\dfrac{d}{d\theta}\cos f(\theta)$, where $f(\theta)$ is a
       function of $\theta$.
   (b) Find the derivatives of $\cos\theta^3$, $\sin(\theta^2 + 3\theta)$, $\cos e^{\theta}$, and $\sin(1/\theta)$.


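6. From the Euler relations, deduce that:
   (a) $e^{-i\pi} = -1$                                   (c) $e^{i\pi/4} = \dfrac{\sqrt{2}}{2}(1 + i)$
   (b) $e^{i\pi/3} = \dfrac{1}{2}(1 + \sqrt{3}\,i)$       (d) $e^{-3i\pi/4} = -\dfrac{\sqrt{2}}{2}(1 + i)$
7. Find the Cartesian form of each complex number:
   (a) $2\left(\cos\dfrac{\pi}{6} + i\sin\dfrac{\pi}{6}\right)$     (b) $4e^{i\pi/3}$     (c) $\sqrt{2}\,e^{-i\pi/4}$
8. Find the polar and exponential forms of the following complex numbers:
   (a) $\dfrac{3}{2} + \dfrac{3\sqrt{3}}{2}\,i$     (b) $4(\sqrt{3} + i)$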


16.3 Analysis of the Complex-Root Case 





With the concepts of complex numbers and circular functions at our disposal, we are now 
prepared to approach the complex-root case (Case 3), referred to in Sec. 16.1. You will re- 
call that the classification of the three cases, according to the nature of the characteristic 
roots, is concerned only with the complementary function of a differential equation. Thus, 
we can continue to focus our attention on the reduced equation 


$$y''(t) + a_1 y'(t) + a_2 y = 0 \qquad [\text{reproduced from (16.4)}]$$


The Complementary Function 
When the values of the coefficients $a_1$ and $a_2$ are such that $a_1^2 < 4a_2$, the characteristic
roots will be the pair of conjugate complex numbers

$$r_1, r_2 = h \pm vi$$

where

$$h = -\frac{1}{2}a_1 \qquad \text{and} \qquad v = \frac{1}{2}\sqrt{4a_2 - a_1^2}$$

The complementary function, as was already previewed, will thus be in the form

$$y_c = e^{ht}(A_1 e^{vit} + A_2 e^{-vit}) \qquad [\text{reproduced from (16.11)}]$$


Let us first transform the imaginary exponential expressions in the parentheses into
equivalent trigonometric expressions, so that we may interpret the complementary function
as a circular function. This may be accomplished by using the Euler relations. Letting
$\theta = vt$ in (16.21) and (16.21'), we find that

$$e^{vit} = \cos vt + i\sin vt \qquad \text{and} \qquad e^{-vit} = \cos vt - i\sin vt$$

From these, it follows that the complementary function in (16.11) can be rewritten as

$$\begin{aligned}
y_c &= e^{ht}[A_1(\cos vt + i\sin vt) + A_2(\cos vt - i\sin vt)] \\
&= e^{ht}[(A_1 + A_2)\cos vt + (A_1 - A_2)i\sin vt]
\end{aligned} \tag{16.24}$$


FIGURE 16.5 [unit circle with points $A$ and $B$; arc $AB = \theta$ (radians)]


Furthermore, if we employ the shorthand symbols

$$A_5 = A_1 + A_2 \qquad \text{and} \qquad A_6 = (A_1 - A_2)i$$

it is possible to simplify (16.24) into*

$$y_c = e^{ht}(A_5\cos vt + A_6\sin vt) \tag{16.24'}$$

where the new arbitrary constants $A_5$ and $A_6$ are later to be definitized.

If you are meticulous, you may feel somewhat uneasy about the substitution of $\theta$ by $vt$
in the foregoing procedure. The variable $\theta$ measures an angle, but $vt$ is a magnitude in units
of $t$ (in our context, time). Therefore, how can we make the substitution $\theta = vt$? The answer
to this question can best be explained with reference to the unit circle (a circle with radius
$R = 1$) in Fig. 16.5. True, we have been using $\theta$ to designate an angle; but since the angle
is measured in radian units, the value of $\theta$ is always the ratio of the length of arc $AB$ to the
radius $R$. When $R = 1$, we have specifically

$$\theta = \frac{\text{arc } AB}{R} = \frac{\text{arc } AB}{1} = \text{arc } AB$$


In other words, $\theta$ is not only the radian measure of the angle, but also the length of the
arc $AB$, which is a number rather than an angle. If the passing of time is charted on the
circumference of the unit circle (counterclockwise), rather than on a straight line as we do
in plotting a time series, it really makes no difference whatsoever whether we consider the
lapse of time as an increase in the radian measure of the angle $\theta$ or as a lengthening of the
arc $AB$. Even if $R \neq 1$, moreover, the same line of reasoning can apply, except that in that
case $\theta$ will be equal to $(\text{arc } AB)/R$ instead; i.e., the angle $\theta$ and the arc $AB$ will bear a fixed
proportion to each other, instead of being equal. Thus, the substitution $\theta = vt$ is indeed
legitimate.

* The fact that in defining $A_6$, we include in it the imaginary number $i$ is by no means an attempt to
"sweep the dirt under the rug." Because $A_6$ is an arbitrary constant, it can take an imaginary as well
as a real value. Nor is it true that, as defined, $A_6$ will necessarily turn out to be imaginary. Actually,
if $A_1$ and $A_2$ are a pair of conjugate complex numbers, say, $m \pm ni$, then $A_5$ and $A_6$ will both be
real: $A_5 = A_1 + A_2 = (m + ni) + (m - ni) = 2m$, and $A_6 = (A_1 - A_2)i = [(m + ni) - (m - ni)]i =
(2ni)i = -2n$.


An Example of Solution 
Let us find the solution of the differential equation

$$y''(t) + 2y'(t) + 17y = 34$$

with the initial conditions $y(0) = 3$ and $y'(0) = 11$.
Since $a_1 = 2$, $a_2 = 17$, and $b = 34$, we can immediately find the particular integral to be

$$y_p = \frac{b}{a_2} = \frac{34}{17} = 2$$

Moreover, since $a_1^2 = 4 < 4a_2 = 68$, the characteristic roots will be the pair of conjugate
complex numbers $(h \pm vi)$, where

$$h = -\frac{1}{2}a_1 = -1 \qquad \text{and} \qquad v = \frac{1}{2}\sqrt{4a_2 - a_1^2} = \frac{1}{2}\sqrt{64} = 4$$

Hence, by (16.24'), the complementary function is

$$y_c = e^{-t}(A_5\cos 4t + A_6\sin 4t)$$

Combining $y_c$ and $y_p$, the general solution can be expressed as

$$y(t) = e^{-t}(A_5\cos 4t + A_6\sin 4t) + 2$$

To definitize the constants $A_5$ and $A_6$, we utilize the two initial conditions. First, by
setting $t = 0$ in the general solution, we find that

$$\begin{aligned}
y(0) &= e^{0}(A_5\cos 0 + A_6\sin 0) + 2 \\
&= (A_5 + 0) + 2 = A_5 + 2 \qquad [\cos 0 = 1;\ \sin 0 = 0]
\end{aligned}$$

By the initial condition $y(0) = 3$, we can thus specify $A_5 = 1$. Next, let us differentiate the
general solution with respect to $t$, using the product rule and the derivative formulas
(16.17) and (16.18) while bearing in mind the chain rule [Exercise 16.2-5], to find $y'(t)$
and then $y'(0)$:

$$y'(t) = -e^{-t}(A_5\cos 4t + A_6\sin 4t) + e^{-t}[A_5(-4\sin 4t) + 4A_6\cos 4t]$$

so that

$$\begin{aligned}
y'(0) &= -(A_5\cos 0 + A_6\sin 0) + (-4A_5\sin 0 + 4A_6\cos 0) \\
&= -(A_5 + 0) + (0 + 4A_6) = 4A_6 - A_5
\end{aligned}$$

By the second initial condition $y'(0) = 11$, and in view that $A_5 = 1$, it then becomes clear
that $A_6 = 3$.† The definite solution is, therefore,

$$y(t) = e^{-t}(\cos 4t + 3\sin 4t) + 2 \tag{16.25}$$


† Note that, here, $A_6$ indeed turns out to be a real number, even though we have included the
imaginary number $i$ in its definition.
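
One way to gain confidence in the definite solution (16.25) is to substitute it back into the differential equation and the initial conditions. The Python sketch below does this with derivatives worked out by hand from (16.25) via the product and chain rules; the function names (`y`, `y_prime`, `y_double_prime`, `residual`) are assumptions of this sketch.

```python
import math

def y(t):
    """The definite solution (16.25)."""
    return math.exp(-t) * (math.cos(4 * t) + 3 * math.sin(4 * t)) + 2

def y_prime(t):
    """First derivative of (16.25): e^{-t}(11 cos 4t - 7 sin 4t)."""
    return math.exp(-t) * (11 * math.cos(4 * t) - 7 * math.sin(4 * t))

def y_double_prime(t):
    """Second derivative of (16.25): e^{-t}(-39 cos 4t - 37 sin 4t)."""
    return math.exp(-t) * (-39 * math.cos(4 * t) - 37 * math.sin(4 * t))

def residual(t):
    """Left side minus right side of y'' + 2y' + 17y = 34; zero for a true solution."""
    return y_double_prime(t) + 2 * y_prime(t) + 17 * y(t) - 34

print(y(0.0), y_prime(0.0))                        # the two initial conditions
print([residual(t) for t in (0.0, 0.5, 1.0, 2.0)]) # each entry should be near zero
```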




As before, the $y_p$ component ($= 2$) can be interpreted as the intertemporal equilibrium
level of $y$, whereas the $y_c$ component represents the deviation from equilibrium. Because of
the presence of circular functions in $y_c$, the time path (16.25) may be expected to exhibit a
fluctuating pattern. But what specific pattern will it involve?


The Time Path 

We are familiar with the paths of a simple sine or cosine function, as shown in Fig. 16.4.
Now we must study the paths of certain variants and combinations of sine and cosine functions
so that we can interpret, in general, the complementary function (16.24')

$$y_c = e^{ht}(A_5\cos vt + A_6\sin vt)$$

and, in particular, the $y_c$ component of (16.25).

Let us first examine the term $(A_5\cos vt)$. By itself, the expression $(\cos vt)$ is a circular
function of $(vt)$, with period $2\pi$ ($= 6.2832$) and amplitude 1. The period of $2\pi$ means that
the graph will repeat its configuration every time that $(vt)$ increases by $2\pi$. When $t$ alone is
taken as the independent variable, however, repetition will occur every time $t$ increases by
$2\pi/v$, so that with reference to $t$, as is appropriate in dynamic economic analysis, we
shall consider the period of $(\cos vt)$ to be $2\pi/v$. (The amplitude, however, remains at 1.)
Now, when a multiplicative constant $A_5$ is attached to $(\cos vt)$, it causes the range of
fluctuation to change from $\pm 1$ to $\pm A_5$. Thus the amplitude now becomes $A_5$, though
the period is unaffected by this constant. In short, $(A_5\cos vt)$ is a cosine function of $t$, with
period $2\pi/v$ and amplitude $A_5$. By the same token, $(A_6\sin vt)$ is a sine function of $t$,
with period $2\pi/v$ and amplitude $A_6$.

There being a common period, the sum $(A_5\cos vt + A_6\sin vt)$ will also display a
repeating cycle every time $t$ increases by $2\pi/v$. To show this more rigorously, let us note that
for given values of $A_5$ and $A_6$ we can always find two constants $A$ and $\varepsilon$, such that

$$A_5 = A\cos\varepsilon \qquad \text{and} \qquad A_6 = -A\sin\varepsilon$$

Thus we may express the said sum as

$$\begin{aligned}
A_5\cos vt + A_6\sin vt &= A\cos\varepsilon\cos vt - A\sin\varepsilon\sin vt \\
&= A(\cos vt\cos\varepsilon - \sin vt\sin\varepsilon) \\
&= A\cos(vt + \varepsilon) \qquad [\text{by (16.16)}]
\end{aligned}$$

This is a modified cosine function of $t$, with amplitude $A$ and period $2\pi/v$, because every
time that $t$ increases by $2\pi/v$, $(vt + \varepsilon)$ will increase by $2\pi$, which will complete a cycle on
the cosine curve.
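
The constants $A$ and $\varepsilon$ can be computed explicitly from $A_5$ and $A_6$. The Python sketch below is one way of doing so, using `math.hypot` and `math.atan2`; the function name `amplitude_phase` and the sample values are assumptions made for illustration, not something given in the text.

```python
import math

def amplitude_phase(A5, A6):
    """Return (A, eps) such that A5 = A*cos(eps) and A6 = -A*sin(eps)."""
    A = math.hypot(A5, A6)       # A = sqrt(A5^2 + A6^2)
    eps = math.atan2(-A6, A5)    # angle whose cosine is A5/A and whose sine is -A6/A
    return A, eps

# Illustrative values only.
A5, A6, v = 1.0, 3.0, 4.0
A, eps = amplitude_phase(A5, A6)

for t in (0.0, 0.2, 0.7, 1.5):
    two_term_sum = A5 * math.cos(v * t) + A6 * math.sin(v * t)
    single_cosine = A * math.cos(v * t + eps)
    print(t, two_term_sum, single_cosine)  # the last two columns should agree
```

The amplitude of the combined wave is thus $\sqrt{A_5^2 + A_6^2}$, which follows from squaring and adding the two defining equations and applying (16.15).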

Had $y_c$ consisted only of the expression $(A_5\cos vt + A_6\sin vt)$, the implication would
have been that the time path of $y$ would be a never-ending, constant-amplitude fluctuation
around the equilibrium value of $y$, as represented by $y_p$. But there is, in fact, also the
multiplicative term $e^{ht}$ to consider. This latter term is of major importance, for, as we shall see,
it holds the key to the question of whether the time path will converge.

If $h > 0$, the value of $e^{ht}$ will increase continually as $t$ increases. This will produce a
magnifying effect on the amplitude of $(A_5\cos vt + A_6\sin vt)$ and cause ever-greater
deviations from the equilibrium in each successive cycle. As illustrated in Fig. 16.6a, the time
path will in this case be characterized by explosive fluctuation. If $h = 0$, on the other hand,
then $e^{ht} = 1$, and the complementary function will simply be $(A_5\cos vt + A_6\sin vt)$,
which has been shown to have a constant amplitude. In this second case, each cycle will
display a uniform pattern of deviation from the equilibrium as illustrated by the time path
in Fig. 16.6b. This is a time path with uniform fluctuation. Last, if $h < 0$, the term $e^{ht}$ will
continually decrease as $t$ increases, and each successive cycle will have a smaller amplitude
than the preceding one, much as the way a ripple dies down. This case is illustrated in
Fig. 16.6c, where the time path is characterized by damped fluctuation. The solution in
(16.25), with $h = -1$, exemplifies this last case. It should be clear that only the case of
damped fluctuation can produce a convergent time path; in the other two cases, the time
path is nonconvergent or divergent.*

FIGURE 16.6 [time paths of $y(t)$ around the equilibrium level: (a) explosive fluctuation; (b) uniform fluctuation; (c) damped fluctuation]

In all three diagrams of Fig. 16.6, the intertemporal equilibrium is assumed to be
stationary. If it is a moving one, the three types of time path depicted will still fluctuate around
it, but since a moving equilibrium generally plots as a curve rather than a horizontal straight
line, the fluctuation will take on the nature of, say, a series of business cycles around a
secular trend.

* We shall use the two words nonconvergent and divergent interchangeably, although the latter is
more strictly applicable to the explosive than to the uniform variety of nonconvergence.


The Dynamic Stability of Equilibrium 

The concept of convergence of the time path of a variable is inextricably tied to the concept
of dynamic stability of the intertemporal equilibrium of that variable. Specifically, the
equilibrium is dynamically stable if, and only if, the time path is convergent. The condition for
convergence of the $y(t)$ path, namely, $h < 0$ (Fig. 16.6c), is therefore also the condition
for dynamic stability of the intertemporal equilibrium of $y$.

You will recall that, for Cases 1 and 2 where the characteristic roots are real, the condition
for dynamic stability of equilibrium is that every characteristic root be negative. In the
present case (Case 3), with complex roots, the condition seems to be more specialized; it
stipulates only that the real part ($h$) of the complex roots $(h \pm vi)$ be negative. However, it
is possible to unify all three cases and consolidate the seemingly different conditions into a
single, generally applicable one. Just interpret any real root $r$ as a complex root whose
imaginary part is zero ($v = 0$). Then the condition "the real part of every characteristic
root be negative" clearly becomes applicable to all three cases and emerges as the only
condition we need.
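
The unified condition is easy to express in code, because complex arithmetic treats a real root simply as a complex number with zero imaginary part. The Python sketch below is a minimal illustration for the equation $y'' + a_1 y' + a_2 y = b$; the function names (`characteristic_roots`, `is_dynamically_stable`, `time_path_type`) and the returned labels are assumptions of this sketch.

```python
import cmath

def characteristic_roots(a1, a2):
    """Roots of r^2 + a1*r + a2 = 0, always returned as complex numbers."""
    disc_root = cmath.sqrt(a1 * a1 - 4 * a2)
    return (-a1 + disc_root) / 2, (-a1 - disc_root) / 2

def is_dynamically_stable(a1, a2):
    """Unified condition: the real part of every characteristic root is negative."""
    return all(root.real < 0 for root in characteristic_roots(a1, a2))

def time_path_type(a1, a2):
    """Classify the fluctuation in the complex-root case by the sign of h = -a1/2."""
    if a1 * a1 >= 4 * a2:
        return "no fluctuation"      # real roots (Cases 1 and 2)
    h = -a1 / 2
    if h > 0:
        return "explosive fluctuation"
    if h == 0:
        return "uniform fluctuation"
    return "damped fluctuation"

# The equation solved in the text, y'' + 2y' + 17y = 34, has a1 = 2 and a2 = 17.
print(characteristic_roots(2, 17), is_dynamically_stable(2, 17), time_path_type(2, 17))
```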














EXERCISE 16.3 


Find the $y_p$ and the $y_c$, the general solution, and the definite solution of each of the
following:

1. $y''(t) - 4y'(t) + 8y = 0$; $y(0) = 3$, $y'(0) = 7$
2. $y''(t) + 4y'(t) + 8y = 2$; $y(0) = 2\tfrac{1}{4}$, $y'(0) = 4$
3. $y''(t) + 3y'(t) + 4y = 12$; $y(0) = 2$, $y'(0) = 2$
4. $y''(t) - 2y'(t) + 10y = 5$; $y(0) = 6$, $y'(0) = 8\tfrac{1}{2}$
5. $y''(t) + 9y = 3$; $y(0) = 1$, $y'(0) = 3$
6. $2y''(t) - 12y'(t) + 20y = 40$; $y(0) = 4$, $y'(0) = 5$
7. Which of the differential equations in Probs. 1 to 6 yield time paths with (a) damped
   fluctuation; (b) uniform fluctuation; (c) explosive fluctuation?


16.4 A Market Model with Price Expectations





In the earlier formulation of the dynamic market model, both $Q_d$ and $Q_s$ are taken to be
functions of the current price $P$ alone. But sometimes buyers and sellers may base their
market behavior not only on the current price but also on the price trend prevailing at the
time, for the price trend is likely to lead them to certain expectations regarding the price
level in the future, and these expectations can, in turn, influence their demand and supply
decisions.


Price Trend and Price Expectations 
In the continuous-time context, the price-trend information is to be found primarily in the
two derivatives $dP/dt$ (whether price is rising) and $d^2P/dt^2$ (whether increasing at an
increasing rate). To take the price trend into account, let us now include these derivatives as
additional arguments in the demand and supply functions:

$$\begin{aligned}
Q_d &= D[P(t), P'(t), P''(t)] \\
Q_s &= S[P(t), P'(t), P''(t)]
\end{aligned}$$

If we confine ourselves to the linear version of these functions and simplify the notation for
the independent variables to $P$, $P'$, and $P''$, we can write

$$\begin{aligned}
Q_d &= \alpha - \beta P + mP' + nP'' & &(\alpha, \beta > 0) \\
Q_s &= -\gamma + \delta P + uP' + wP'' & &(\gamma, \delta > 0)
\end{aligned} \tag{16.26}$$

where the parameters $\alpha$, $\beta$, $\gamma$, and $\delta$ are merely carryovers from the previous market
models, but $m$, $n$, $u$, and $w$ are new.

The four new parameters, whose signs have not been restricted, embody the buyers' and
sellers' price expectations. If $m > 0$, for instance, a rising price will cause $Q_d$ to increase.
This would suggest that buyers expect the rising price to continue to rise and, hence, prefer
to increase their purchases now, when the price is still relatively low. The opposite sign for
$m$ would, on the other hand, signify the expectation of a prompt reversal of the price trend,
so the buyers would prefer to cut back current purchases and wait for a lower price to
materialize later. The inclusion of the parameter $n$ makes the buyers' behavior depend also on
the rate of change of $dP/dt$. Thus the new parameters $m$ and $n$ inject a substantial element
of price speculation into the model. The parameters $u$ and $w$ carry a similar implication on
the sellers' side of the picture.


A Simplified Model 

For simplicity, we shall assume that only the demand function contains price expectations.
Specifically, we let $m$ and $n$ be nonzero, but let $u = w = 0$ in (16.26). Further assume that
the market is cleared at every point of time. Then we may equate the demand and supply
functions to obtain (after normalizing) the differential equation

$$P'' + \frac{m}{n}P' - \frac{\beta + \delta}{n}P = -\frac{\alpha + \gamma}{n} \tag{16.27}$$

This equation is in the form of (16.2) with the following substitutions:

$$y = P \qquad a_1 = \frac{m}{n} \qquad a_2 = -\frac{\beta + \delta}{n} \qquad b = -\frac{\alpha + \gamma}{n}$$


Since this pattern of change of $P$ involves the second derivative $P''$ as well as the first
derivative $P'$, the present model is certainly distinct from the dynamic market model
presented in Sec. 15.2.

Note, however, that the present model differs from the previous model in yet another
way. In Sec. 15.2, a dynamic adjustment mechanism, $dP/dt = j(Q_d - Q_s)$, is present.
Since that equation implies that $dP/dt = 0$ if and only if $Q_d = Q_s$, the intertemporal
sense and the market-clearing sense of equilibrium are coincident in that model. In
contrast, the present model assumes market clearance at every moment of time. Thus every
price attained in the market is an equilibrium price in the market-clearing sense, although
it may not qualify as the intertemporal equilibrium price. In other words, the two senses
of equilibrium are now disparate. Note, also, that the adjustment mechanism
$dP/dt = j(Q_d - Q_s)$, containing a derivative, is what makes the previous market model dynamic.
In the present model, with no adjustment mechanism, the dynamic nature of the model
emanates instead from the expectation terms $mP'$ and $nP''$.


The Time Path of Price 
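The intertemporal equilibrium price of this model, the particular integral $P_p$ (formerly
$y_p$), is easily found by using (16.3). It is

$$P_p = \frac{b}{a_2} = \frac{\alpha + \gamma}{\beta + \delta}$$

Because this is a (positive) constant, it represents a stationary equilibrium.
As for the complementary function $P_c$ (formerly $y_c$), there are three possible cases.

Case 1 (distinct real roots)

$$\left(\frac{m}{n}\right)^2 > -4\left(\frac{\beta + \delta}{n}\right)$$

The complementary function of this case is, by (16.7),

$$P_c = A_1 e^{r_1 t} + A_2 e^{r_2 t}$$

where

$$r_1, r_2 = \frac{1}{2}\left[-\frac{m}{n} \pm \sqrt{\left(\frac{m}{n}\right)^2 + 4\left(\frac{\beta + \delta}{n}\right)}\,\right] \tag{16.28}$$

Accordingly, the general solution is

$$P(t) = P_c + P_p = A_1 e^{r_1 t} + A_2 e^{r_2 t} + \frac{\alpha + \gamma}{\beta + \delta} \tag{16.29}$$

Case 2 (double real roots)

$$\left(\frac{m}{n}\right)^2 = -4\left(\frac{\beta + \delta}{n}\right)$$

In this case, the characteristic roots take the single value

$$r = -\frac{m}{2n}$$

thus, by (16.9), the general solution may be written as

$$P(t) = A_3 e^{-mt/2n} + A_4 t e^{-mt/2n} + \frac{\alpha + \gamma}{\beta + \delta} \tag{16.29'}$$

Case 3 (complex roots)

$$\left(\frac{m}{n}\right)^2 < -4\left(\frac{\beta + \delta}{n}\right)$$

In this third and last case, the characteristic roots are the pair of conjugate complex
numbers

$$r_1, r_2 = h \pm vi$$

where

$$h = -\frac{m}{2n} \qquad \text{and} \qquad v = \frac{1}{2}\sqrt{-4\left(\frac{\beta + \delta}{n}\right) - \left(\frac{m}{n}\right)^2}$$

Therefore, by (16.24'), we have the general solution

$$P(t) = e^{-mt/2n}(A_5\cos vt + A_6\sin vt) + \frac{\alpha + \gamma}{\beta + \delta} \tag{16.29''}$$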

A couple of general conclusions can be deduced from these results. First, if $n > 0$, then
$-4(\beta + \delta)/n$ must be negative and hence less than $(m/n)^2$. Hence Cases 2 and 3 can
immediately be ruled out. Moreover, with $n$ positive (as are $\beta$ and $\delta$), the expression under
the square-root sign in (16.28) necessarily exceeds $(m/n)^2$, and thus the square root must
be greater than $|m/n|$. The $\pm$ sign in (16.28) would then produce one positive root ($r_1$) and
one negative root ($r_2$). Consequently, the intertemporal equilibrium is dynamically unstable,
unless the definitized value of the constant $A_1$ happens to be zero in (16.29).

Second, if $n < 0$, then all three cases become feasible. Under Case 1, we can be sure
that both roots will be negative if $m$ is negative. (Why?) Interestingly, the repeated root of
Case 2 will also be negative if $m$ is negative. Moreover, since $h$, the real part of the complex
roots in Case 3, takes the same value as the repeated root $r$ in Case 2, the negativity of $m$
will also guarantee that $h$ is negative. In short, for all three cases, the dynamic stability of
equilibrium is ensured when the parameters $m$ and $n$ are both negative.
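
These conclusions can be explored numerically by computing the characteristic roots of (16.27) for chosen parameter values. The Python sketch below is only illustrative: the function names `price_model_roots` and `equilibrium_price` and the sample parameter values are assumptions introduced here, and complex arithmetic is used so that all three cases are handled by one formula.

```python
import cmath

def price_model_roots(beta, delta, m, n):
    """Characteristic roots of (16.27): r^2 + (m/n) r - (beta + delta)/n = 0."""
    a1 = m / n
    a2 = -(beta + delta) / n
    disc_root = cmath.sqrt(a1 * a1 - 4 * a2)   # real in Cases 1 and 2, imaginary in Case 3
    return (-a1 + disc_root) / 2, (-a1 - disc_root) / 2

def equilibrium_price(alpha, beta, gamma, delta):
    """Intertemporal equilibrium price (alpha + gamma) / (beta + delta)."""
    return (alpha + gamma) / (beta + delta)

# Hypothetical parameter sets: one with n > 0, one with m and n both negative.
print(price_model_roots(beta=1.0, delta=1.0, m=-1.0, n=2.0))   # n > 0: roots of opposite sign
print(price_model_roots(beta=1.0, delta=1.0, m=-3.0, n=-1.0))  # m, n < 0: both real parts negative
```

Trying other negative values of $m$ and $n$ should always yield roots with negative real parts, in line with the second conclusion.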


Example 1
Let the demand and supply functions be

$$\begin{aligned}
Q_d &= 42 - 4P - 4P' + P'' \\
Q_s &= -6 + 8P
\end{aligned}$$

with initial conditions $P(0) = 6$ and $P'(0) = 4$. Assuming market clearance at every point of
time, find the time path $P(t)$.
In this example, the parameter values are

$$\alpha = 42 \qquad \beta = 4 \qquad \gamma = 6 \qquad \delta = 8 \qquad m = -4 \qquad n = 1$$

Since $n$ is positive, our previous discussion suggests that only Case 1 can arise, and that the
two (real) roots $r_1$ and $r_2$ will take opposite signs. Substitution of the parameter values into
(16.28) indeed confirms this, for

$$r_1, r_2 = \frac{1}{2}\left(4 \pm \sqrt{16 + 48}\right) = \frac{1}{2}(4 \pm 8) = 6, -2$$

The general solution is, then, by (16.29),

$$P(t) = A_1 e^{6t} + A_2 e^{-2t} + 4$$

By taking the initial conditions into account, moreover, we find that $A_1 = A_2 = 1$, so the
definite solution is

$$P(t) = e^{6t} + e^{-2t} + 4$$

In view of the positive root $r_1 = 6$, the intertemporal equilibrium ($P_p = 4$) is dynamically
unstable.

The preceding solution is found by use of formulas (16.28) and (16.29). Alternatively, we
can first equate the given demand and supply functions to obtain the differential equation

$$P'' - 4P' - 12P = -48$$

and then solve this equation as a specific case of (16.2).


Example 2
Given the demand and supply functions

$$\begin{aligned}
Q_d &= 40 - 2P - 2P' - P'' \\
Q_s &= -5 + 3P
\end{aligned}$$

with $P(0) = 12$ and $P'(0) = 1$, find $P(t)$ on the assumption that the market is always
cleared.

Here the parameters $m$ and $n$ are both negative. According to our previous general
discussion, therefore, the intertemporal equilibrium should be dynamically stable. To find the
specific solution, we may first equate $Q_d$ and $Q_s$ to obtain the differential equation (after
multiplying through by $-1$)

$$P'' + 2P' + 5P = 45$$

The intertemporal equilibrium is given by the particular integral

$$P_p = \frac{45}{5} = 9$$

From the characteristic equation of the differential equation,

$$r^2 + 2r + 5 = 0$$

we find that the roots are complex:

$$r_1, r_2 = \frac{1}{2}\left(-2 \pm \sqrt{4 - 20}\right) = \frac{1}{2}(-2 \pm 4i) = -1 \pm 2i$$

This means that $h = -1$ and $v = 2$, so the general solution is

$$P(t) = e^{-t}(A_5\cos 2t + A_6\sin 2t) + 9$$


To definitize the arbitrary constants $A_5$ and $A_6$, we set $t = 0$ in the general solution, to
get

$$P(0) = e^{0}(A_5\cos 0 + A_6\sin 0) + 9 = A_5 + 9 \qquad [\cos 0 = 1;\ \sin 0 = 0]$$

Moreover, by differentiating the general solution and then setting $t = 0$, we find that

$$P'(t) = -e^{-t}(A_5\cos 2t + A_6\sin 2t) + e^{-t}(-2A_5\sin 2t + 2A_6\cos 2t) \qquad [\text{product rule and chain rule}]$$

and

$$\begin{aligned}
P'(0) &= -e^{0}(A_5\cos 0 + A_6\sin 0) + e^{0}(-2A_5\sin 0 + 2A_6\cos 0) \\
&= -(A_5 + 0) + (0 + 2A_6) = -A_5 + 2A_6
\end{aligned}$$

Thus, by virtue of the initial conditions $P(0) = 12$ and $P'(0) = 1$, we have $A_5 = 3$ and
$A_6 = 2$. Consequently, the definite solution is

$$P(t) = e^{-t}(3\cos 2t + 2\sin 2t) + 9$$

This time path is obviously one with periodic fluctuation; the period is $2\pi/v = \pi$. That is,
there is a complete cycle every time that $t$ increases by $\pi = 3.14159\ldots$ In view of the
multiplicative term $e^{-t}$, the fluctuation is damped. The time path, which starts from the
initial price $P(0) = 12$, converges to the intertemporal equilibrium price $P_p = 9$ in a cyclical
fashion.
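
The period and the damping of this time path can be made concrete by comparing the deviation from equilibrium at times one period apart. The Python sketch below assumes the definite solution just derived; the names `deviation`, `P`, `EQUILIBRIUM`, and `period`, as well as the sample times, are assumptions chosen for illustration.

```python
import math

EQUILIBRIUM = 9.0  # the intertemporal equilibrium price P_p

def deviation(t):
    """Deviation from equilibrium: the complementary function e^{-t}(3 cos 2t + 2 sin 2t)."""
    return math.exp(-t) * (3 * math.cos(2 * t) + 2 * math.sin(2 * t))

def P(t):
    """The definite solution: equilibrium price plus deviation."""
    return EQUILIBRIUM + deviation(t)

period = 2 * math.pi / 2   # 2*pi / v with v = 2

# One period later the circular part repeats itself, so the deviation is simply
# scaled by the damping factor exp(-period).
for t in (0.0, 0.4, 1.0):
    ratio = deviation(t + period) / deviation(t)
    print(t, P(t), ratio, math.exp(-period))  # the last two columns should agree
```

Since the ratio is below one in absolute value, each cycle is a scaled-down copy of the preceding one, which is the sense in which the fluctuation is damped.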





EXERCISE 16.4 


1. Let the parameters $m$, $n$, $u$, and $w$ in (16.26) be all nonzero.
   (a) Assuming market clearance at every point of time, write the new differential
       equation of the model.
   (b) Find the intertemporal equilibrium price.
   (c) Under what circumstances can periodic fluctuation be ruled out?
2. Let the demand and supply functions be as in (16.26), but with $u = w = 0$ as in the
   text discussion.
   (a) If the market is not always cleared, but adjusts according to
       $$\frac{dP}{dt} = j(Q_d - Q_s) \qquad (j > 0)$$
       write the appropriate new differential equation.
   (b) Find the intertemporal equilibrium price $\bar{P}$ and the market-clearing equilibrium
       price $P^*$.
   (c) State the condition for having a fluctuating price path. Can fluctuation occur if
       $n > 0$?
3. Let the demand and supply be

   $$Q_d = 9 - P + P' + 3P'' \qquad Q_s = -1 + 4P - P' + 5P''$$

   with $P(0) = 4$ and $P'(0) = 4$.
   (a) Find the price path, assuming market clearance at every point of time.
   (b) Is the time path convergent? With fluctuation?


16.5 The Interaction of Inflation and Unemployment 





In this section, we illustrate the use of a second-order differential equation with a macro model dealing with the problem of inflation and unemployment.


The Phillips Relation 

One of the most widely used concepts in the modern analysis of the problem of inflation and unemployment is the Phillips relation.† In its original formulation, this relation depicts an empirically based negative relation between the rate of growth of money wage and the rate of unemployment:

$$w = f(U) \qquad [f'(U) < 0] \qquad (16.30)$$

† A. W. Phillips, "The Relationship Between Unemployment and the Rate of Change of Money Wage Rates in the United Kingdom, 1861-1957," Economica, November 1958, pp. 283-299.




where the lowercase letter $w$ denotes the rate of growth of money wage $W$ (i.e., $w \equiv \dot{W}/W$) and $U$ is the rate of unemployment. It thus pertains only to the labor market. Later usage, however, has adapted the Phillips relation into a function that links the rate of inflation (instead of $w$) to the rate of unemployment. This adaptation may be justified by arguing that mark-up pricing is in wide use, so that a positive $w$, reflecting growing money-wage cost, would necessarily carry inflationary implications. And this makes the rate of inflation, like $w$, a function of $U$. The inflationary pressure of a positive $w$ can, however, be offset by an increase in labor productivity, assumed to be exogenous, and denoted here by $T$. Specifically, the inflationary effect can materialize only to the extent that money wage grows faster than productivity. Denoting the rate of inflation (that is, the rate of growth of the price level $P$) by the lowercase letter $p$ (i.e., $p \equiv \dot{P}/P$), we may thus write

$$p = w - T \qquad (16.31)$$

Combining (16.30) and (16.31), and adopting the linear version of the function $f(U)$, we then get an adapted Phillips relation

$$p = \alpha - T - \beta U \qquad (\alpha, \beta > 0) \qquad (16.32)$$


The Expectations-Augmented Phillips Relation 
More recently, economists have preferred to use the expectations-augmented version of the Phillips relation

$$w = f(U) + g\pi \qquad (0 < g \le 1) \qquad (16.30')$$

where $\pi$ denotes the expected rate of inflation. The underlying idea of (16.30'), as propounded by the Nobel laureate Professor Friedman,† is that if an inflationary trend has been in effect long enough, people are apt to form certain inflation expectations which they then attempt to incorporate into their money-wage demands. Thus $w$ should be an increasing function of $\pi$. Carried over to (16.32), this idea results in the equation

$$p = \alpha - T - \beta U + g\pi \qquad (0 < g \le 1) \qquad (16.33)$$





With the introduction of a new variable to denote the expected rate of inflation, it becomes necessary to hypothesize how inflation expectations are specifically formed.‡ Here we adopt the adaptive expectations hypothesis

$$\frac{d\pi}{dt} = j(p - \pi) \qquad (0 < j \le 1) \qquad (16.34)$$

Note that, rather than explain the absolute magnitude of $\pi$, this equation describes instead its pattern of change over time. If the actual rate of inflation $p$ turns out to exceed the expected rate $\pi$, the latter, having now been proven to be too low, is revised upward ($d\pi/dt > 0$). Conversely, if $p$ falls short of $\pi$, then $\pi$ is revised in the downward direction. In format, (16.34) closely resembles the adjustment mechanism $dP/dt = j(Q_d - Q_s)$ of the market model. But here the driving force behind the adjustment is the discrepancy between the actual and expected rates of inflation, rather than $Q_d$ and $Q_s$.

† Milton Friedman, "The Role of Monetary Policy," American Economic Review, March 1968, pp. 1-17.
‡ This is in contrast to Sec. 16.4, where price expectations were discussed without introducing a new variable to represent the expected price. As a result, the assumptions regarding the formation of expectations were only implicitly embedded in the parameters m, n, u, and w in (16.26).
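The adjustment rule (16.34) can be illustrated in isolation. The sketch below assumes, purely for illustration, that the actual rate of inflation $p$ is held constant; in the model itself $p$ is endogenous. Under that assumption (16.34) is a first-order linear equation whose solution is $\pi(t) = p + [\pi(0) - p]e^{-jt}$. The function name and the numerical values are hypothetical.

```python
import math


def expected_inflation(t, pi0, p, j):
    """Path of pi under d(pi)/dt = j*(p - pi) with p held constant.

    Holding p constant is an assumption of this sketch, not of the model.
    """
    return p + (pi0 - p) * math.exp(-j * t)


# Hypothetical values: expectations start at 2%, actual inflation is 5%, j = 0.5.
for t in (0, 1, 2, 4, 8):
    print(t, round(expected_inflation(t, pi0=0.02, p=0.05, j=0.5), 4))
```

Since the expected rate starts below the actual rate here, it is revised upward throughout, and the gap $p - \pi$ shrinks at the rate $j$.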


The Feedback from Inflation to Unemployment 

It is possible to consider (16.33) and (16.34) as constituting a complete model. Since there are three variables in a two-equation system, however, one of the variables has to be taken as exogenous. If $\pi$ and $p$ are considered endogenous, for instance, then $U$ must be treated as exogenous. A more satisfying alternative is to introduce a third equation to explain the variable $U$, so that the model will be richer in behavioral characteristics. More significantly, this will provide us with an opportunity to take into account the feedback effect of inflation on unemployment. Equation (16.33) tells us how $U$ affects $p$, largely from the supply side of the economy. But $p$ surely can affect $U$ in return. For example, the rate of inflation may influence the consumption-saving decisions of the public, hence also the aggregate demand for domestic production, and the latter will, in turn, affect the rate of unemployment. Even in the conduct of government policies of demand management, the rate of inflation can make a difference in their effectiveness. Depending on the rate of inflation, a given level of money expenditure (fiscal policy) could translate into varying levels of real expenditure, and similarly, a given rate of nominal-money expansion (monetary policy) could mean varying rates of real-money expansion. And these, in turn, would imply differing effects on output and unemployment.

For simplicity, we shall only take into consideration the feedback through the conduct of monetary policy. Denoting the nominal money balance by $M$ and its rate of growth by $m \equiv \dot{M}/M$, let us postulate that*

$$\frac{dU}{dt} = -k(m - p) \qquad (k > 0) \qquad (16.35)$$

Recalling (10.25), and applying it backward, we see that the expression $(m - p)$ represents the rate of growth of real money:

$$m - p = \frac{\dot{M}}{M} - \frac{\dot{P}}{P} = r_M - r_P = r_{(M/P)}$$

Thus (16.35) stipulates that $dU/dt$ is negatively related to the rate of growth of real-money balance. Inasmuch as the variable $p$ now enters into the determination of $dU/dt$, the model now contains a feedback from inflation to unemployment.


The Time Path of π
Together, (16.33) through (16.35) constitute a closed model in the three variables $\pi$, $p$, and $U$. By eliminating two of the three variables, however, we can condense the model into a single differential equation in a single variable. Suppose that we let that single variable be $\pi$. Then we may first substitute (16.33) into (16.34) to get

$$\frac{d\pi}{dt} = j(\alpha - T - \beta U) - j(1 - g)\pi \qquad (16.36)$$


* In an earlier discussion, we denoted the money supply by $M_s$, to distinguish it from the demand for money $M_d$. Here, we can simply use the unsubscripted letter $M$, since there is no fear of confusion.




Had this equation contained the expression $dU/dt$ instead of $U$, we could have substituted (16.35) into (16.36) directly. But as (16.36) stands, we must first deliberately create a $dU/dt$ term by differentiating (16.36) with respect to $t$, with the result

$$\frac{d^2\pi}{dt^2} = -j\beta\frac{dU}{dt} - j(1 - g)\frac{d\pi}{dt} \qquad (16.37)$$

Substitution of (16.35) into this then yields

$$\frac{d^2\pi}{dt^2} = j\beta km - j\beta kp - j(1 - g)\frac{d\pi}{dt} \qquad (16.37')$$

There is still a $p$ variable to be eliminated. To achieve that, we note that (16.34) implies

$$p = \frac{1}{j}\frac{d\pi}{dt} + \pi \qquad (16.38)$$

Using this result in (16.37'), and simplifying, we finally obtain the desired differential equation in the variable $\pi$ alone:

$$\frac{d^2\pi}{dt^2} + \underbrace{[\beta k + j(1 - g)]}_{a_1}\frac{d\pi}{dt} + \underbrace{(j\beta k)}_{a_2}\pi = \underbrace{j\beta km}_{b} \qquad (16.37'')$$

The particular integral of this equation is simply

$$\pi_p = \frac{b}{a_2} = m$$

Thus, in this model, the intertemporal equilibrium value of the expected rate of inflation hinges exclusively on the rate of growth of nominal money.
For the complementary function, the two roots are, as before,

$$r_1, r_2 = \frac{1}{2}\left(-a_1 \pm \sqrt{a_1^2 - 4a_2}\right) \qquad (16.39)$$

where, as may be noted from (16.37''), both $a_1$ and $a_2$ are positive. On a priori grounds, it is not possible to determine whether $a_1^2$ would exceed, equal, or be less than $4a_2$. Thus all three cases of characteristic roots (distinct real roots, repeated real roots, or complex roots) can conceivably arise. Whichever case presents itself, however, the intertemporal equilibrium will prove dynamically stable in the present model. This can be explained as follows: Suppose, first, that Case 1 prevails, with $a_1^2 > 4a_2$. Then the square root in (16.39) yields a real number. Since $a_2$ is positive, $\sqrt{a_1^2 - 4a_2}$ is necessarily less than $\sqrt{a_1^2} = a_1$. It follows that $r_1$ is negative, as is $r_2$, implying a dynamically stable equilibrium. What if $a_1^2 = 4a_2$ (Case 2)? In that event, the square root is zero, so that $r_1 = r_2 = -a_1/2 < 0$. And the negativity of the repeated roots again implies dynamic stability. Finally, for Case 3, the real part of the complex roots is $h = -a_1/2$. Since this has the same value as the repeated roots under Case 2, the identical conclusion regarding dynamic stability applies.
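The three-case argument can be spot-checked numerically. The following is a minimal sketch: the function name `characteristic_roots` and the parameter sets are hypothetical, chosen only so that each of the three cases occurs once. It forms $a_1 = \beta k + j(1 - g)$ and $a_2 = j\beta k$ as in (16.37'') and evaluates (16.39).

```python
import cmath


def characteristic_roots(beta, k, j, g):
    """Roots of r^2 + a1*r + a2 = 0 with a1, a2 defined as in (16.37'')."""
    a1 = beta * k + j * (1 - g)
    a2 = j * beta * k
    disc = cmath.sqrt(a1 * a1 - 4 * a2)
    return (-a1 + disc) / 2, (-a1 - disc) / 2


# Hypothetical parameter sets (beta, k, j, g), one per case.
cases = {
    "Case 1 (distinct real)": (4.0, 1.0, 0.5, 0.5),   # a1^2 > 4*a2
    "Case 2 (repeated real)": (2.0, 1.0, 0.5, 1.0),   # a1^2 = 4*a2
    "Case 3 (complex)": (3.0, 0.5, 0.75, 1.0),        # a1^2 < 4*a2
}

for label, params in cases.items():
    r1, r2 = characteristic_roots(*params)
    stable = r1.real < 0 and r2.real < 0
    print(label, r1, r2, "stable" if stable else "unstable")
```

A check on a few parameter sets is of course no substitute for the general argument in the text; it only shows the argument at work on concrete numbers.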

Although we have only studied the time path of $\pi$, the model can certainly yield information on the other variables, too. To find the time path of, say, the $U$ variable, we can either start off by condensing the model into a differential equation in $U$ rather than $\pi$ (see Exercise 16.5-2) or deduce the $U$ path from the $\pi$ path already found (see Example 1).







Example 1 


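Let the three equations of the model take the specific forms

$$p = \frac{1}{6} - 3U + \pi \qquad (16.40)$$
$$\frac{d\pi}{dt} = \frac{3}{4}(p - \pi) \qquad (16.41)$$
$$\frac{dU}{dt} = -\frac{1}{2}(m - p) \qquad (16.42)$$

Then we have the parameter values $\beta = 3$, $g = 1$, $j = \frac{3}{4}$, and $k = \frac{1}{2}$; thus, with reference to (16.37''), we find

$$a_1 = \beta k + j(1 - g) = \frac{3}{2} \qquad a_2 = j\beta k = \frac{9}{8} \qquad \text{and} \qquad b = j\beta km = \frac{9}{8}m$$

The particular integral is $b/a_2 = m$. With $a_1^2 < 4a_2$, the characteristic roots are complex:

$$r_1, r_2 = \frac{1}{2}\left(-\frac{3}{2} \pm \sqrt{\frac{9}{4} - \frac{9}{2}}\right) = \frac{1}{2}\left(-\frac{3}{2} \pm \frac{3}{2}i\right) = -\frac{3}{4} \pm \frac{3}{4}i$$

That is, $h = -\frac{3}{4}$ and $v = \frac{3}{4}$. Consequently, the general solution for the expected rate of inflation is

$$\pi(t) = e^{-3t/4}\left(A_5 \cos\frac{3}{4}t + A_6 \sin\frac{3}{4}t\right) + m \qquad (16.43)$$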


which depicts a time path with damped fluctuation around the equilibrium value $m$. From this, we can also deduce the time paths for the $p$ and $U$ variables. According to (16.41), $p$ can be expressed in terms of $\pi$ and $d\pi/dt$ by the equation

$$p = \frac{4}{3}\frac{d\pi}{dt} + \pi$$

The $\pi$ path in the general solution (16.43) implies the derivative

$$\frac{d\pi}{dt} = -\frac{3}{4}e^{-3t/4}\left(A_5 \cos\frac{3}{4}t + A_6 \sin\frac{3}{4}t\right) + e^{-3t/4}\left(-\frac{3}{4}A_5 \sin\frac{3}{4}t + \frac{3}{4}A_6 \cos\frac{3}{4}t\right) \qquad \text{[product rule and chain rule]}$$

Using the solution (16.43) and its derivative, we thus have

$$p(t) = e^{-3t/4}\left(A_6 \cos\frac{3}{4}t - A_5 \sin\frac{3}{4}t\right) + m \qquad (16.44)$$

Like the expected rate of inflation $\pi$, the actual rate of inflation $p$ also has a fluctuating time path converging to the equilibrium value $m$.

As for the $U$ variable, (16.40) tells us that it can be expressed in terms of $\pi$ and $p$ as follows:

$$U = \frac{1}{3}(\pi - p) + \frac{1}{18}$$

By virtue of the solutions (16.43) and (16.44), therefore, we can write the time path of the rate of unemployment as

$$U(t) = \frac{1}{3}e^{-3t/4}\left[(A_5 - A_6)\cos\frac{3}{4}t + (A_5 + A_6)\sin\frac{3}{4}t\right] + \frac{1}{18} \qquad (16.45)$$




This path is, again, one with damped fluctuation, with $\frac{1}{18}$ as $\bar{U}$, the dynamically stable intertemporal equilibrium value of $U$.
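The closed-form paths (16.43) through (16.45) can be compared with a direct numerical integration of the system (16.40) through (16.42). The sketch below is illustrative only: the money-growth rate `M`, the constants `A5` and `A6`, the horizon, and the choice of a fourth-order Runge-Kutta integrator are all assumptions made here, not part of the text.

```python
import math

M = 0.04               # hypothetical rate of nominal-money growth m
A5, A6 = 0.02, 0.01    # hypothetical values of the arbitrary constants


def closed_form(t):
    """Return (pi, p, U) from (16.43), (16.44), and (16.45)."""
    decay = math.exp(-0.75 * t)
    c, s = math.cos(0.75 * t), math.sin(0.75 * t)
    pi = decay * (A5 * c + A6 * s) + M
    p = decay * (A6 * c - A5 * s) + M
    u = decay * ((A5 - A6) * c + (A5 + A6) * s) / 3 + 1 / 18
    return pi, p, u


def rhs(pi, u):
    """Time derivatives of pi and U implied by (16.40)-(16.42)."""
    p = 1 / 6 - 3 * u + pi                      # (16.40)
    return 0.75 * (p - pi), -0.5 * (M - p)      # (16.41), (16.42)


def rk4(pi, u, t_end, steps):
    """Integrate the two-variable system with classical Runge-Kutta."""
    dt = t_end / steps
    for _ in range(steps):
        k1 = rhs(pi, u)
        k2 = rhs(pi + dt / 2 * k1[0], u + dt / 2 * k1[1])
        k3 = rhs(pi + dt / 2 * k2[0], u + dt / 2 * k2[1])
        k4 = rhs(pi + dt * k3[0], u + dt * k3[1])
        pi += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        u += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return pi, u


pi0, _, u0 = closed_form(0.0)                   # initial values implied by A5, A6
pi_num, u_num = rk4(pi0, u0, t_end=5.0, steps=5000)
pi_exact, _, u_exact = closed_form(5.0)
print(abs(pi_num - pi_exact), abs(u_num - u_exact))   # both gaps should be tiny
```

The integration starts from the initial values implied by the chosen constants, so any sizable gap at the end of the horizon would signal an inconsistency between the three equations and the closed-form paths.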

Because the intertemporal equilibrium values of $\pi$ and $p$ are both equal to the monetary-policy parameter $m$, the value of $m$ (the rate of growth of nominal money) provides the axis around which the time paths of $\pi$ and $p$ fluctuate. If a change occurs in $m$, a new equilibrium value of $\pi$ and $p$ will immediately replace the old one, and whatever values the $\pi$ and $p$ variables happen to take at the moment of the monetary-policy change will become the initial values from which the new $\pi$ and $p$ paths emanate.

In contrast, the intertemporal equilibrium value $\bar{U}$ does not depend on $m$. According to (16.45), $U$ converges to the constant $\frac{1}{18}$ regardless of the rate of growth of nominal money, and hence regardless of the equilibrium rate of inflation. This constant equilibrium value of $U$ is referred to as the natural rate of unemployment. The fact that the natural rate of unemployment is consistent with any equilibrium rate of inflation can be represented in the $Up$ space by a vertical straight line parallel to the $p$ axis. That vertical line, relating the equilibrium values of $U$ and $p$ to each other, is known as the long-run Phillips curve. The vertical shape of this curve, however, is contingent upon a special parameter value assumed in this example. When that value is altered, as in Exercise 16.5-4, the long-run Phillips curve may no longer be vertical.





EXERCISE 16.5 


1. In the inflation-unemployment model, retain (16.33) and (16.34) but delete (16.35) and let $U$ be exogenous instead.
(a) What kind of differential equation will now arise?
(b) How many characteristic roots can you obtain? Is it possible now to have periodic fluctuation in the complementary function?
2. In the text discussion, we condensed the inflation-unemployment model into a differential equation in the variable $\pi$. Show that the model can alternatively be condensed into a second-order differential equation in the variable $U$, with the same $a_1$ and $a_2$ coefficients as in (16.37''), but a different constant term $b = kj[\alpha - T - (1 - g)m]$.
3. Let the adaptive expectations hypothesis (16.34) be replaced by the so-called perfect foresight hypothesis $\pi = p$, but retain (16.33) and (16.35).
(a) Derive a differential equation in the variable $p$.
(b) Derive a differential equation in the variable $U$.
(c) How do these equations differ fundamentally from the one we obtained under the adaptive expectations hypothesis?
(d) What change in parameter restriction is now necessary to make the new differential equations meaningful?
4. In Example 1, retain (16.41) and (16.42) but replace (16.40) by
$$p = \frac{1}{6} - 3U + \frac{1}{3}\pi$$
(a) Find $p(t)$, $\pi(t)$, and $U(t)$.
(b) Are the time paths still fluctuating? Still convergent?
(c) What are $\bar{p}$ and $\bar{U}$, the intertemporal equilibrium values of $p$ and $U$?
(d) Is it still true that $\bar{U}$ is functionally unrelated to $\bar{p}$? If we now link these two equilibrium values to each other in a long-run Phillips curve, can we still get a vertical curve? What assumption in Example 1 is thus crucial for deriving a vertical long-run Phillips curve?




16.6 Differential Equations with a Variable Term 





In the differential equations considered in Sec. 16.1,

$$y''(t) + a_1y'(t) + a_2y = b$$

the right-hand term $b$ is a constant. What if, instead of $b$, we have on the right a variable term: i.e., some function of $t$ such as $bt^2$, $e^{bt}$, or $b \sin t$? The answer is that we must then modify our particular integral $y_p$. Fortunately, the complementary function is not affected by the presence of a variable term, because $y_c$ deals only with the reduced equation, whose right side is always zero.


Method of Undetermined Coefficients

We shall explain a method of finding $y_p$, known as the method of undetermined coefficients, which is applicable to constant-coefficient variable-term differential equations, as long as the variable term and its successive derivatives together contain only a finite number of distinct types of expression (apart from multiplicative constants). The explanation of this method can best be carried out with a concrete illustration.


Example 1

Find the particular integral of

$$y''(t) + 5y'(t) + 3y = 6t^2 - t - 1 \qquad (16.46)$$


By definition, the particular integral is a value of $y$ satisfying the given equation, i.e., a value of $y$ that will make the left side identically equal to the right side regardless of the value of $t$. Since the left side contains the function $y(t)$ and the derivatives $y'(t)$ and $y''(t)$, whereas the right side contains multiples of the expressions $t^2$, $t$, and a constant, we ask: What general function form of $y(t)$, along with its first and second derivatives, will give us the three types of expression $t^2$, $t$, and a constant? The obvious answer is a function of the form $B_1t^2 + B_2t + B_3$ (where $B_i$ are coefficients yet to be determined), for if we write the particular integral as

$$y(t) = B_1t^2 + B_2t + B_3$$

we can derive

$$y'(t) = 2B_1t + B_2 \qquad \text{and} \qquad y''(t) = 2B_1 \qquad (16.47)$$

and these three equations are indeed composed of the said types of expression. Substituting these into (16.46) and collecting terms, we get

$$\text{Left side} = (3B_1)t^2 + (10B_1 + 3B_2)t + (2B_1 + 5B_2 + 3B_3)$$


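And when this is equated term by term to the right side, we can determine the coefficients $B_i$ as follows:

$$\left.\begin{aligned} 3B_1 &= 6 \\ 10B_1 + 3B_2 &= -1 \\ 2B_1 + 5B_2 + 3B_3 &= -1 \end{aligned}\right\} \quad \Rightarrow \quad \begin{cases} B_1 = 2 \\ B_2 = -7 \\ B_3 = 10 \end{cases}$$

Thus the desired particular integral can be written as

$$y_p = 2t^2 - 7t + 10$$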
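The coefficient matching in this example can be reproduced with exact arithmetic. The following is a minimal sketch, assuming the triangular system obtained above; the variable and function names are illustrative. It solves for $B_1$, $B_2$, $B_3$ in turn and then checks that the left side of (16.46) reproduces $6t^2 - t - 1$.

```python
from fractions import Fraction

# Right side of (16.46), 6t^2 - t - 1, as coefficients of t^2, t, and 1.
rhs = [Fraction(6), Fraction(-1), Fraction(-1)]

# Equate with the left side 3*B1*t^2 + (10*B1 + 3*B2)*t + (2*B1 + 5*B2 + 3*B3).
B1 = rhs[0] / 3
B2 = (rhs[1] - 10 * B1) / 3
B3 = (rhs[2] - 2 * B1 - 5 * B2) / 3


def left_side(t):
    """Evaluate y'' + 5y' + 3y for the trial solution y = B1*t^2 + B2*t + B3."""
    y = B1 * t**2 + B2 * t + B3
    dy = 2 * B1 * t + B2
    d2y = 2 * B1
    return d2y + 5 * dy + 3 * y


print(B1, B2, B3)   # 2 -7 10
```

Because the system is triangular, each coefficient follows from the ones already found, which is why the three equations can be solved from top to bottom.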


This method can work only when the number of expression types is finite. (See Exercise 16.6-1.) In general, when this prerequisite is met, the particular integral may be taken as being in the form of a linear combination of all the distinct expression types contained in the given variable term, as well as in all its derivatives. Note, in particular, that a constant expression should be included in the particular integral, if the original variable term or any of its successive derivatives contains a constant term.


Example 2

As a further illustration, let us find the general form for the particular integral suitable for the variable term $(b \sin t)$. Repeated differentiation yields, in this case, the successive derivatives $(b \cos t)$, $(-b \sin t)$, $(-b \cos t)$, $(b \sin t)$, etc., which involve only two distinct types of expression. We may therefore try a particular integral of the form $(B_1 \sin t + B_2 \cos t)$.


A Modification
In certain cases, a complication arises in applying the method. When the coefficient of the $y$ term in the given differential equation is zero, such as in

$$y''(t) + 5y'(t) = 6t^2 - t - 1$$

the previously used trial form for the $y_p$, namely, $B_1t^2 + B_2t + B_3$, will fail to work. The cause of this failure is that, since the $y(t)$ term is out of the picture and since only derivatives $y'(t)$ and $y''(t)$ as shown in (16.47) will be substituted into the left side, no $B_1t^2$ term will ever appear on the left to be equated to the $6t^2$ term on the right. The way out of this kind of difficulty is to use instead the trial solution $t(B_1t^2 + B_2t + B_3)$; or if this too fails (e.g., given the equation $y''(t) = 6t^2 - t - 1$), to use $t^2(B_1t^2 + B_2t + B_3)$, and so on.

Indeed, the same trick may be employed in yet another difficult circumstance, as is illustrated in Example 3.


Example 3

Find the particular integral of

$$y''(t) + 3y'(t) - 4y = 2e^{-4t} \qquad (16.48)$$


Here, the variable term is in the form of $e^{-4t}$, but all of its successive derivatives (namely, $-8e^{-4t}$, $32e^{-4t}$, $-128e^{-4t}$, etc.) take the same form as well. If we try the solution

$$y(t) = Be^{-4t} \qquad [\text{with } y'(t) = -4Be^{-4t} \text{ and } y''(t) = 16Be^{-4t}]$$

and substitute these into (16.48), we obtain the inauspicious result that

$$\text{Left side} = (16 - 12 - 4)Be^{-4t} = 0 \qquad (16.49)$$

which obviously cannot be equated to the right-side term $2e^{-4t}$.
What causes this to happen is the fact that the exponential coefficient in the variable term ($-4$) happens to be equal to one of the roots of the characteristic equation of (16.48):

$$r^2 + 3r - 4 = 0 \qquad (\text{roots } r_1, r_2 = 1, -4)$$

The characteristic equation, it will be recalled, is obtained through a process of differentiation;† but the expression $(16 - 12 - 4)$ in (16.49) is derived through the same process. Not surprisingly, therefore, $(16 - 12 - 4)$ is merely a specific version of $(r^2 + 3r - 4)$ with $r$ set equal to $-4$. Since $-4$ happens to be a characteristic root, the quadratic expression

$$r^2 + 3r - 4 = 16 - 12 - 4$$

must of necessity be identically zero.

† See the text discussion leading to (16.4'').


To cope with this situation, let us try instead the solution

$$y(t) = Bte^{-4t}$$

with derivatives

$$y'(t) = (1 - 4t)Be^{-4t} \qquad \text{and} \qquad y''(t) = (-8 + 16t)Be^{-4t}$$

Substituting these into (16.48) will now yield: left side $= -5Be^{-4t}$. When this is equated to the right side, we determine the coefficient to be $B = -2/5$. Consequently, the desired particular integral of (16.48) can be written as

$$y_p = -\frac{2}{5}te^{-4t}$$
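The substitution step in Example 3 can be written out term by term. The block below is only a worked restatement of that step, using the derivatives of the trial solution $y = Bte^{-4t}$ and the equation (16.48); it adds no new assumption.

```latex
\begin{aligned}
y''(t) + 3y'(t) - 4y
  &= (-8 + 16t)Be^{-4t} + 3(1 - 4t)Be^{-4t} - 4Bte^{-4t} \\
  &= \bigl[(-8 + 3) + (16 - 12 - 4)t\bigr]Be^{-4t} \\
  &= -5Be^{-4t}
\end{aligned}
```

The coefficient of $t$ inside the brackets is the same expression $(16 - 12 - 4)$ that vanished in (16.49), so the $t$ term drops out and only the constant $-5$ survives. Setting $-5Be^{-4t} = 2e^{-4t}$ then gives $B = -2/5$.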





EXERCISE 16.6 


1. Show that the method of undetermined coefficients is inapplicable to the differential equation $y''(t) - ay'(t) + by = t^{-1}$.

2. Find the particular integral of each of the following equations by the method of undetermined coefficients:
(a) $y''(t) + 2y'(t) + y = t$
(b) $y''(t) + 4y'(t) + y = 2t^2$
(c) $y''(t) + y'(t) + 2y = e^t$
(d) $y''(t) + y'(t) + 3y = \sin t$


16.7 Higher-Order Linear Differential Equations





The methods of solution introduced in the previous sections are readily extended to an nth-order linear differential equation. With constant coefficients and a constant term, such an equation can be written generally as

$$y^{(n)}(t) + a_1y^{(n-1)}(t) + \cdots + a_{n-1}y'(t) + a_ny = b \qquad (16.50)$$


Finding the Solution 
In this case of constant coefficients and constant term, the presence of the higher derivatives does not materially affect the method of finding the particular integral discussed earlier.

If we try the simplest possible type of solution, $y = k$, we can see that all the derivatives from $y'(t)$ to $y^{(n)}(t)$ will be zero; hence (16.50) will reduce to $a_nk = b$, and we can write

$$y_p = k = \frac{b}{a_n} \qquad (a_n \ne 0) \qquad [\text{cf. } (16.3)]$$

In case $a_n = 0$, however, we must try a solution of the form $y = kt$. Then, since $y'(t) = k$ but all the higher derivatives will vanish, (16.50) can be reduced to $a_{n-1}k = b$, thereby yielding the particular integral

$$y_p = kt = \frac{b}{a_{n-1}}t \qquad (a_n = 0;\ a_{n-1} \ne 0) \qquad [\text{cf. } (16.3')]$$

If it happens that $a_n = a_{n-1} = 0$, then this last solution will fail, too; instead, a solution of the form $y = kt^2$ must be tried. Further adaptations of this procedure should be obvious.
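The "further adaptations" follow one pattern, which can be sketched in code. With $y = kt^j$, the $j$th derivative equals $j!\,k$ and all higher derivatives vanish, so if $a_n = \cdots = a_{n-j+1} = 0$ and $a_{n-j} \ne 0$, the equation reduces to $a_{n-j}\,j!\,k = b$. The helper below is a hypothetical illustration; it also accepts a leading coefficient $a_0$ other than 1, which is a generalization made here and not in (16.50).

```python
from fractions import Fraction
from math import factorial


def particular_integral(coeffs, b):
    """Return (k, j) such that y_p = k * t**j for a0*y^(n) + ... + an*y = b.

    coeffs = [a0, a1, ..., an]. Hypothetical helper for illustration:
    j is the number of trailing zero coefficients.
    """
    n = len(coeffs) - 1
    for j in range(n + 1):
        a = coeffs[n - j]                 # coefficient of the j-th derivative
        if a != 0:
            return Fraction(b) / (factorial(j) * a), j
    raise ValueError("all coefficients are zero")


# a_n != 0: y'''' + 6y''' + 14y'' + 16y' + 8y = 24 gives y_p = 3.
print(particular_integral([1, 6, 14, 16, 8], 24))

# Hypothetical case with a_n = a_{n-1} = 0: y''' + 2y'' = 6 gives y_p = (3/2) t^2.
print(particular_integral([1, 2, 0, 0], 6))
```

In the second call the trial form is $y = kt^2$, so $y'' = 2k$ and the equation reduces to $2 \cdot 2k = 6$, which is the value the function returns.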




As for the complementary function, inclusion of the higher-order derivatives in the differential equation has the effect of raising the degree of the characteristic equation. The complementary function is defined as the general solution of the reduced equation

$$y^{(n)}(t) + a_1y^{(n-1)}(t) + \cdots + a_{n-1}y'(t) + a_ny = 0 \qquad (16.51)$$

Trying $y = Ae^{rt}$ ($A \ne 0$) as a solution and utilizing the knowledge that this implies $y'(t) = rAe^{rt}$, $y''(t) = r^2Ae^{rt}$, ..., $y^{(n)}(t) = r^nAe^{rt}$, we can rewrite (16.51) as

$$Ae^{rt}(r^n + a_1r^{n-1} + \cdots + a_{n-1}r + a_n) = 0$$

This equation is satisfied by any value of $r$ which satisfies the following (nth-degree polynomial) characteristic equation

$$r^n + a_1r^{n-1} + \cdots + a_{n-1}r + a_n = 0 \qquad (16.51')$$

There will, of course, be $n$ roots to this polynomial, and each of these should be included in the general solution of (16.51). Thus our complementary function should in general be in the form

$$y_c = A_1e^{r_1t} + A_2e^{r_2t} + \cdots + A_ne^{r_nt} \qquad \left[= \sum_{i=1}^{n} A_ie^{r_it}\right]$$


As before, however, some modifications must be made in case the $n$ roots are not all real and distinct. First, suppose that there are repeated roots, say, $r_1 = r_2 = r_3$. Then, to avoid "collapsing," we must write the first three terms of the solution as $A_1e^{r_1t} + A_2te^{r_1t} + A_3t^2e^{r_1t}$ [cf. (16.9)]. In case we have $r_4 = r_1$ as well, the fourth term must be altered to $A_4t^3e^{r_1t}$, etc.

Second, suppose that two of the roots are complex, say,

$$r_5, r_6 = h \pm vi$$

then the fifth and sixth terms in the preceding solution should be combined into the following expression:

$$e^{ht}(A_5 \cos vt + A_6 \sin vt) \qquad [\text{cf. } (16.24')]$$

By the same token, if two distinct pairs of complex roots are found, there must be two such trigonometric expressions (with a different set of values of $h$, $v$, and two arbitrary constants for each).* As a further possibility, if there happen to be two pairs of repeated complex roots, then we should use $e^{ht}$ as the multiplicative term for one but use $te^{ht}$ for the other. Also, even though $h$ and $v$ have identical values in the repeated complex roots, a different pair of arbitrary constants must now be assigned to each.

Once $y_p$ and $y_c$ are found, the general solution of the complete equation (16.50) follows easily. As before, it is simply the sum of the complementary function and the particular integral: $y(t) = y_c + y_p$. In this general solution, we can count a total of $n$ arbitrary constants. Thus, to definitize the solution, as many as $n$ initial conditions will be required.

* It is of interest to note that, inasmuch as complex roots always come in conjugate pairs, we can be sure of having at least one real root when the differential equation is of an odd order, i.e., when $n$ is an odd number.




Example 1 


Find the general solution of

$$y^{(4)}(t) + 6y'''(t) + 14y''(t) + 16y'(t) + 8y = 24$$

The particular integral of this fourth-order equation is simply

$$y_p = \frac{24}{8} = 3$$

Its characteristic equation is, by (16.51'),

$$r^4 + 6r^3 + 14r^2 + 16r + 8 = 0$$

which can be factored into the form

$$(r + 2)(r + 2)(r^2 + 2r + 2) = 0$$

From the first two parenthetical expressions, we can obtain the double roots $r_1 = r_2 = -2$, but the last (quadratic) expression yields the pair of complex roots $r_3, r_4 = -1 \pm i$, with $h = -1$ and $v = 1$. Consequently, the complementary function is

$$y_c = A_1e^{-2t} + A_2te^{-2t} + e^{-t}(A_3 \cos t + A_4 \sin t)$$

and the general solution is

$$y(t) = A_1e^{-2t} + A_2te^{-2t} + e^{-t}(A_3 \cos t + A_4 \sin t) + 3$$

The four constants $A_1$, $A_2$, $A_3$, and $A_4$ can be definitized, of course, if we are given four initial conditions.

Note that all the characteristic roots in this example either are real and negative or are complex and with a negative real part. The time path must therefore be convergent, and the intertemporal equilibrium is dynamically stable.
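The roots quoted in this example can be checked by direct substitution into the characteristic polynomial. The sketch below is illustrative; the function names are not from the text. It also evaluates the derivative of the polynomial at $r = -2$, since a root is repeated exactly when the derivative vanishes there as well.

```python
def char_poly(r):
    """Characteristic polynomial r^4 + 6r^3 + 14r^2 + 16r + 8."""
    return r**4 + 6 * r**3 + 14 * r**2 + 16 * r + 8


def char_poly_derivative(r):
    """Derivative of the characteristic polynomial with respect to r."""
    return 4 * r**3 + 18 * r**2 + 28 * r + 16


roots = [-2, complex(-1, 1), complex(-1, -1)]
for r in roots:
    print(r, abs(char_poly(r)))          # each magnitude should be zero

print(char_poly_derivative(-2))          # zero as well, so -2 is a repeated root
```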


Convergence and the Routh Theorem 
The solution of a high-degree characteristic equation is not always an easy task. For this reason, it should be of tremendous help if we can find a way of ascertaining the convergence or divergence of a time path without having to solve for the characteristic roots. Fortunately, there does exist such a method, which can provide a qualitative (though nongraphic) analysis of a differential equation.

This method is to be found in the Routh theorem,† which states that:

The real parts of all of the roots of the nth-degree polynomial equation

$$a_0r^n + a_1r^{n-1} + \cdots + a_{n-1}r + a_n = 0$$

are negative if and only if the first $n$ of the following sequence of determinants

$$|a_1|; \quad \begin{vmatrix} a_1 & a_3 \\ a_0 & a_2 \end{vmatrix}; \quad \begin{vmatrix} a_1 & a_3 & a_5 \\ a_0 & a_2 & a_4 \\ 0 & a_1 & a_3 \end{vmatrix}; \quad \begin{vmatrix} a_1 & a_3 & a_5 & a_7 \\ a_0 & a_2 & a_4 & a_6 \\ 0 & a_1 & a_3 & a_5 \\ 0 & a_0 & a_2 & a_4 \end{vmatrix}; \quad \ldots$$

all are positive.

In applying this theorem, it should be remembered that $|a_1| \equiv a_1$. Further, it is to be understood that we should take $a_m = 0$ for all $m > n$. For example, given a third-degree polynomial equation ($n = 3$), we need to examine the signs of the first three determinants listed in the Routh theorem; for that purpose, we should set $a_4 = a_5 = 0$.

The relevance of this theorem to the convergence problem should become self-evident when we recall that, in order for the time path $y(t)$ to converge regardless of what the initial conditions happen to be, all the characteristic roots of the differential equation must have negative real parts. Since the characteristic equation (16.51') is an nth-degree polynomial equation, with $a_0 = 1$, the Routh theorem can be of direct help in the testing of convergence. In fact, we note that the coefficients of the characteristic equation (16.51') are wholly identical with those of the given differential equation (16.51), so it is perfectly acceptable to substitute the coefficients of (16.51) directly into the sequence of determinants shown in the Routh theorem for testing, provided that we always take $a_0 = 1$. Inasmuch as the condition cited in the theorem is given on the "if and only if" basis, it obviously constitutes a necessary-and-sufficient condition.

† For a discussion of this theorem, and a sketch of its proof, see Paul A. Samuelson, Foundations of Economic Analysis, Harvard University Press, 1947, pp. 429-435, and the references there cited.


Example 2

Test by the Routh theorem whether the differential equation of Example 1 has a convergent time path. This equation is of the fourth order, so $n = 4$. The coefficients are $a_0 = 1$, $a_1 = 6$, $a_2 = 14$, $a_3 = 16$, $a_4 = 8$, and $a_5 = a_6 = a_7 = 0$. Substituting these into the first four determinants, we find their values to be 6, 68, 800, and 6,400, respectively. Because they are all positive, we can conclude that the time path is convergent.
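The determinant sequence in the Routh theorem is mechanical enough to compute. The following is a minimal sketch: the function names are illustrative, and the determinant routine (Gaussian elimination on exact fractions) is a choice made here, not something the text prescribes. The entry in row $i$, column $j$ of each determinant is $a_{2j-i}$, with $a_m = 0$ whenever $m < 0$ or $m > n$.

```python
from fractions import Fraction


def determinant(matrix):
    """Determinant by Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, det = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                      # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def routh_determinants(coeffs):
    """First n determinants of the Routh sequence for coeffs = [a0, ..., an]."""
    n = len(coeffs) - 1

    def a(i):
        return coeffs[i] if 0 <= i <= n else 0   # a_m = 0 outside 0..n

    return [
        determinant([[a(2 * (c + 1) - (r + 1)) for c in range(size)]
                     for r in range(size)])
        for size in range(1, n + 1)
    ]


dets = routh_determinants([1, 6, 14, 16, 8])      # coefficients of Example 1
print([str(d) for d in dets])                     # ['6', '68', '800', '6400']
print(all(d > 0 for d in dets))                   # True, so the path converges
```

The printed values agree with the 6, 68, 800, and 6,400 quoted in Example 2.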





EXERCISE 16.7 


1. Find the particular integral of each of the following:
(a) $y'''(t) + 2y''(t) + y'(t) + 2y = 8$
(b) $y'''(t) + y''(t) + 3y'(t) = 1$
(c) $3y'''(t) + 9y''(t) = 1$
(d) $y^{(4)}(t) + y''(t) = 4$
2. Find the $y_p$ and the $y_c$ (and hence the general solution) of:
(a) $y'''(t) - 2y''(t) - y'(t) + 2y = 4$
[Hint: $r^3 - 2r^2 - r + 2 = (r - 1)(r + 1)(r - 2)$]
(b) $y'''(t) + 7y''(t) + 15y'(t) + 9y = 0$
[Hint: $r^3 + 7r^2 + 15r + 9 = (r + 1)(r^2 + 6r + 9)$]
(c) $y'''(t) + 6y''(t) + 10y'(t) + 8y = 8$
[Hint: $r^3 + 6r^2 + 10r + 8 = (r + 4)(r^2 + 2r + 2)$]
3. On the basis of the signs of the characteristic roots obtained in Prob. 2, analyze the dynamic stability of equilibrium. Then check your answer by the Routh theorem.
4. Without finding their characteristic roots, determine whether the following differential equations will give rise to convergent time paths:
(a) $y'''(t) - 10y''(t) + 27y'(t) - 18y = 3$
(b) $y'''(t) + 11y''(t) + 34y'(t) + 24y = 5$
(c) $y'''(t) + 4y''(t) + 5y'(t) - 2y = -2$
5. Deduce from the Routh theorem that, for the second-order linear differential equation $y''(t) + a_1y'(t) + a_2y = b$, the solution path will be convergent regardless of initial conditions if and only if the coefficients $a_1$ and $a_2$ are both positive.


Chapter 17

Discrete Time: First-Order Difference Equations


In the continuous-time context, the pattern of change of a variable $y$ is embodied in the derivatives $y'(t)$, $y''(t)$, etc. The time change involved in these is occurring continuously. When time is, instead, taken to be a discrete variable, so that the variable $t$ is allowed to take integer values only, the concept of the derivative obviously will no longer be appropriate. Then, as we shall see, the pattern of change of the variable $y$ must be described by so-called differences, rather than by derivatives or differentials, of $y(t)$. Accordingly, the techniques of differential equations will give way to those of difference equations.

When we are dealing with discrete time, the value of variable $y$ will change only when the variable $t$ changes from one integer value to the next, such as from $t = 1$ to $t = 2$. Meanwhile, nothing is supposed to happen to $y$. In this light, it becomes more convenient to interpret the values of $t$ as referring to periods (rather than points) of time, with $t = 1$ denoting period 1 and $t = 2$ denoting period 2, and so forth. Then we may simply regard $y$ as having one unique value in each time period. In view of this interpretation, the discrete-time version of economic dynamics is often referred to as period analysis. It should be emphasized, however, that "period" is being used here not in the calendar sense but in the analytical sense. Hence, a period may involve one extent of calendar time in a particular economic model, but an altogether different one in another. Even in the same model, moreover, each successive period should not necessarily be construed as meaning equal calendar time. In the analytical sense, a period is merely a length of time that elapses before the variable $y$ undergoes a change.


17.1 Discrete Time, Differences, and Difference Equations


The change from continuous time to discrete time produces no effect on the fundamental
nature of dynamic analysis, although the formulation of the problem must be altered.
Basically, our dynamic problem is still to find a time path from some given pattern of change of
a variable y over time. But the pattern of change should now be represented by the
difference quotient $\Delta y/\Delta t$, which is the discrete-time counterpart of the derivative dy/dt.
Recall, however, that t can now take only integer values; thus, when we are comparing the


Chapter 17 Discrete Time: First-Order Difference Equations 545 


values of y in two consecutive periods, we must have $\Delta t = 1$. For this reason, the difference
quotient $\Delta y/\Delta t$ can be simplified to the expression $\Delta y$; this is called the first difference
of y. The symbol $\Delta$, meaning difference, can accordingly be interpreted as a directive to
take the first difference of y. As such, it constitutes the discrete-time counterpart of the
operator symbol d/dt.

The expression $\Delta y$ can take various values, of course, depending on which two consecutive
time periods are involved in the difference-taking (or “differencing”). To avoid ambiguity,
let us add a time subscript to y and define the first difference more specifically, as
follows:


$$\Delta y_t = y_{t+1} - y_t \tag{17.1}$$


where $y_t$ means the value of y in the tth period, and $y_{t+1}$ is its value in the period
immediately following the tth period. With this symbology, we may describe the pattern of change
of y by an equation such as





$$\Delta y_t = 2 \tag{17.2}$$
or
$$\Delta y_t = -0.1y_t \tag{17.3}$$


Equations of this type are called difference equations. Note the striking resemblance
between the last two equations, on the one hand, and the differential equations dy/dt = 2
and dy/dt = -0.1y on the other.
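
To make the idea of a first difference concrete, the short Python sketch below computes $\Delta y_t = y_{t+1} - y_t$ for a list of period values. The function name `first_differences` and the two sample paths are illustrative assumptions, chosen so that one path has a constant difference of 2 and the other has a difference equal to -0.1 times the current value.

```python
def first_differences(path):
    """Return the list of first differences y_{t+1} - y_t for consecutive periods."""
    return [nxt - cur for cur, nxt in zip(path, path[1:])]


# Hypothetical path with a constant first difference of 2
print(first_differences([15, 17, 19, 21]))

# Hypothetical path whose first difference is -0.1 times the current value
print(first_differences([100, 90, 81]))
```

A path of n values yields n - 1 differences, since each difference compares two consecutive periods.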

Even though difference equations derive their name from difference expressions such as
$\Delta y_t$, there are alternate equivalent forms of such equations which are completely free of $\Delta$
expressions and which are more convenient to use. By virtue of (17.1), we can rewrite
(17.2) as


$$y_{t+1} - y_t = 2 \tag{17.2'}$$
or
$$y_{t+1} = y_t + 2 \tag{17.2''}$$
For (17.3), the corresponding alternate equivalent forms are
$$y_{t+1} - 0.9y_t = 0 \tag{17.3'}$$
or
$$y_{t+1} = 0.9y_t \tag{17.3''}$$
The double-prime-numbered versions will prove convenient when we are calculating a
y value from a known y value of the preceding period. In later discussions, however, we
shall employ mostly the single-prime-numbered versions, i.e., those of (17.2') and (17.3').
It is important to note that the choice of time subscripts in a difference equation is
somewhat arbitrary. For instance, without any change in meaning, (17.2') can be rewritten as
$y_t - y_{t-1} = 2$, where (t - 1) refers to the period which immediately precedes the tth. Or,
we may express it equivalently as $y_{t+2} - y_{t+1} = 2$.


546 Part Five Dynamic Analysis 


Also, it may be pointed out that, although we have consistently used subscripted y
symbols, it is also acceptable to use y(t), y(t + 1), and y(t - 1) in their stead. In order to avoid
using the notation y(t) for both continuous-time and discrete-time cases, however, we
shall, in the discussion of period analysis, adhere to the subscript device.

Analogous to differential equations, difference equations can be either linear or
nonlinear, homogeneous or nonhomogeneous, and of the first or second (or higher) orders. Take
(17.2) for instance. It can be classified as: (1) linear, for no y term (of any period) is raised
to the second (or higher) power or is multiplied by a y term of another period;
(2) nonhomogeneous, since the right-hand side (where there is no y term) is nonzero; and (3) of the
first order, because there exists only a first difference $\Delta y_t$, involving a one-period time lag
only. (In contrast, a second-order difference equation, to be discussed in Chap. 18, involves
a two-period lag and thus entails three y terms: $y_{t+2}$, $y_{t+1}$, as well as $y_t$.)

Actually, (17.2') can also be characterized as having constant coefficients and a constant
term (= 2). Since the constant-coefficient case is the only one we shall consider, this
characterization will henceforth be implicitly assumed. Throughout the present chapter, the
constant-term feature will also be retained, although a method of dealing with the
variable-term case will be discussed in Chap. 18.

Check that the equation (17.3') is also linear and of the first order; but unlike (17.2'), it
is homogeneous.





17.2 Solving a First-Order Difference Equation 





In solving a differential equation, our objective was to find a time path y(t). As we know, 
such a time path is a function of time which is totally free from any derivative (or differen- 
tial) expressions and which is perfectly consistent with the given differential equation as 
well as with its initial conditions. The time path we seek from a difference equation is
similar in nature. Again, it should be a function of t (a formula defining the values of y in
every time period) which is consistent with the given difference equation as well as with
its initial conditions. Besides, it must not contain any difference expressions such as $\Delta y_t$
(or expressions like $y_{t+1} - y_t$).

Solving differential equations is, in the final analysis, a matter of integration. How do we 
solve a difference equation? 


Iterative Method 

Before developing a general method of attack, let us first explain a relatively pedestrian 
method, the iterative method—which, though crude, will prove immensely revealing of the 
essential nature of a so-called solution. 

In this chapter we are concerned only with the first-order case; thus the difference
equation describes the pattern of change of y between two consecutive periods only. Once such
a pattern is specified, such as by (17.2''), and once we are given an initial value $y_0$, it is no
problem to find $y_1$ from the equation. Similarly, once $y_1$ is found, $y_2$ will be immediately
obtainable, and so forth, by repeated application (iteration) of the pattern of change
specified in the difference equation. The results of iteration will then permit us to infer a
time path.


Example 1


Find the solution of the difference equation (17.2), assuming an initial value of $y_0 = 15$. To
carry out the iterative process, it is more convenient to use the alternative form of the
difference equation (17.2''), namely, $y_{t+1} = y_t + 2$, with $y_0 = 15$. From this equation, we
can deduce step-by-step that

$$y_1 = y_0 + 2$$
$$y_2 = y_1 + 2 = (y_0 + 2) + 2 = y_0 + 2(2)$$

and, in general, for any period t,

$$y_t = y_0 + t(2) = 15 + 2t \tag{17.4}$$

This last equation indicates the y value of any time period (including the initial period
t = 0); it therefore constitutes the solution of (17.2).
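
The iteration in this example can be mimicked in a few lines of Python. The sketch below is only an illustration: the helper name `iterate_path` is an assumption, and the comparison with 15 + 2t simply checks the first few periods rather than proving the general formula.

```python
def iterate_path(step, y0, periods):
    """Return [y_0, y_1, ..., y_periods] by repeated application of y_{t+1} = step(y_t)."""
    path = [y0]
    for _ in range(periods):
        path.append(step(path[-1]))
    return path


# Example 1: y_{t+1} = y_t + 2 with y_0 = 15
iterated = iterate_path(lambda y: y + 2, 15, 5)

# Closed-form solution (17.4): y_t = 15 + 2t
closed_form = [15 + 2 * t for t in range(6)]

print(iterated)
print(iterated == closed_form)
```

Passing the pattern of change as a function keeps the helper reusable for any first-order equation written in the double-prime form.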


The process of iteration is crude (it corresponds roughly to solving simple differential
equations by straight integration), but it serves to point out clearly the manner in which a
time path is generated. In general, the value of $y_t$ will depend in a specified way on the
value of y in the immediately preceding period ($y_{t-1}$); thus a given initial value $y_0$ will
successively lead to $y_1, y_2, \ldots$, via the prescribed pattern of change.





Example 2


Solve the difference equation (17.3); this time, let the initial value be unspecified and
denoted simply by $y_0$. Again it is more convenient to work with the alternative version in
(17.3''), namely, $y_{t+1} = 0.9y_t$. By iteration, we have

$$y_1 = 0.9y_0$$
$$y_2 = 0.9y_1 = 0.9(0.9y_0) = (0.9)^2 y_0$$
$$y_3 = 0.9y_2 = 0.9(0.9)^2 y_0 = (0.9)^3 y_0$$

These can be summarized into the solution

$$y_t = (0.9)^t y_0 \tag{17.5}$$


To heighten interest, we can lend some economic content to this example. In the simple
multiplier analysis, a single investment expenditure in period 0 will call forth successive
rounds of spending, which in turn will bring about varying amounts of income increment
in succeeding time periods. Using y to denote income increment, we have $y_0$ = the amount
of investment in period 0; but the subsequent income increments will depend on the
marginal propensity to consume (MPC). If MPC = 0.9 and if the income of each period
is consumed only in the next period, then 90 percent of $y_0$ will be consumed in period 1,
resulting in an income increment in period 1 of $y_1 = 0.9y_0$. By similar reasoning, we can
find $y_2 = 0.9y_1$, etc. These, we see, are precisely the results of the iterative process cited
previously. In other words, the multiplier process of income generation can be described by
a difference equation such as (17.3''), and a solution like (17.5) will tell us what the
magnitude of income increment is to be in any time period t.
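
As a numerical illustration of the multiplier process, the sketch below iterates $y_{t+1} = 0.9y_t$ and compares the result with $(0.9)^t y_0$. The initial investment of 100 is a hypothetical figure, and exact fractions are used (an implementation choice, not something the text requires) so that the comparison is not disturbed by floating-point rounding.

```python
from fractions import Fraction

mpc = Fraction(9, 10)   # marginal propensity to consume, 0.9
y0 = Fraction(100)      # hypothetical investment outlay in period 0

# Iterate y_{t+1} = 0.9 * y_t for four periods
increments = [y0]
for _ in range(4):
    increments.append(mpc * increments[-1])

# Solution (17.5): y_t = (0.9)^t * y_0
closed_form = [mpc ** t * y0 for t in range(5)]

for t, y in enumerate(increments):
    print(t, float(y))
print(increments == closed_form)
```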


Example 3

Solve the homogeneous difference equation

$$m y_{t+1} - n y_t = 0$$


Upon normalizing and transposing, this may be written as

$$y_{t+1} = \left(\frac{n}{m}\right) y_t$$

which is the same as (17.3'') in Example 2 except for the replacement of 0.9 by n/m. Hence,
by analogy, the solution should be

$$y_t = \left(\frac{n}{m}\right)^t y_0$$

Watch the term $(n/m)^t$. It is through this term that various values of t will lead to their
corresponding values of y. It therefore corresponds to the expression $e^{rt}$ in the solutions to
differential equations. If we write it more generally as $b^t$ (b for base) and attach the more
general multiplicative constant A (instead of $y_0$), we see that the solution of the general
homogeneous difference equation of Example 3 will be in the form

$$y_t = Ab^t$$

We shall find that this expression $Ab^t$ will play the same important role in difference
equations as the expression $Ae^{rt}$ did in differential equations.† However, even though both are
exponential expressions, the former is to the base b, whereas the latter is to the base e. It
stands to reason that, just as the type of the continuous-time path y(t) depends heavily on
the value of r, the discrete-time path $y_t$ hinges principally on the value of b.


General Method 


By this time, you must have become quite impressed with the various similarities between
differential and difference equations. As might be conjectured, the general method of
solution presently to be explained will parallel that for differential equations.

Suppose that we are seeking the solution to the first-order difference equation

$$y_{t+1} + ay_t = c \tag{17.6}$$

where a and c are two constants. The general solution will consist of the sum of two
components: a particular solution $y_p$, which is any solution of the complete nonhomogeneous
equation (17.6), and a complementary function $y_c$, which is the general solution of the
reduced equation of (17.6):

$$y_{t+1} + ay_t = 0 \tag{17.7}$$

The $y_p$ component again represents the intertemporal equilibrium level of y, and the $y_c$
component, the deviations of the time path from that equilibrium. The sum of $y_c$ and $y_p$
constitutes the general solution, because of the presence of an arbitrary constant. As before,
in order to definitize the solution, an initial condition is needed.

Let us first deal with the complementary function. Our experience with Example 3
suggests that we may try a solution of the form $y_t = Ab^t$ (with $Ab^t \neq 0$, for otherwise $y_t$
will turn out simply to be a horizontal straight line lying on the t axis); in that case, we also


† You may object to this statement by pointing out that the solution (17.4) in Example 1 does not
contain a term in the form of $Ab^t$. This latter fact, however, arises only because in Example 1 we have
b = n/m = 1/1 = 1, so that the term $Ab^t$ reduces to a constant.




have $y_{t+1} = Ab^{t+1}$. If these values of $y_t$ and $y_{t+1}$ hold, the homogeneous equation (17.7)
will become

$$Ab^{t+1} + aAb^t = 0$$

which, upon canceling the nonzero common factor $Ab^t$, yields

$$b + a = 0 \quad\text{or}\quad b = -a$$

This means that, for the trial solution to work, we must set b = -a; then the
complementary function should be written as

$$y_c\,(= Ab^t) = A(-a)^t$$


Now let us search for the particular solution, which has to do with the complete
equation (17.6). In this regard, Example 3 is of no help at all, because that example relates only
to a homogeneous equation. However, we note that for $y_p$ we can choose any solution of
(17.6); thus if a trial solution of the simplest form $y_t = k$ (a constant) can work out, no real
difficulty will be encountered. Now, if $y_t = k$, then y will maintain the same constant value
over time, and we must have $y_{t+1} = k$ also. Substitution of these values into (17.6) yields

$$k + ak = c \quad\text{and}\quad k = \frac{c}{1+a}$$

Since this particular k value satisfies the equation, the particular integral can be written as

$$y_p\,(= k) = \frac{c}{1+a} \qquad (a \neq -1)$$


This being a constant, a stationary equilibrium is indicated in this case. 

If it happens that a = -1, as in Example 1, however, the particular solution c/(1 + a) is
not defined, and some other solution of the nonhomogeneous equation (17.6) must be
sought. In this event, we employ the now-familiar trick of trying a solution of the form
$y_t = kt$. This implies, of course, that $y_{t+1} = k(t + 1)$. Substituting these into (17.6), we find

$$k(t+1) + akt = c \quad\text{and}\quad k = \frac{c}{t + 1 + at} = c \qquad [\text{because } a = -1]$$

thus

$$y_p\,(= kt) = ct$$

This form of the particular solution is a nonconstant function of t; it therefore represents a
moving equilibrium.

Adding $y_c$ and $y_p$ together, we may now write the general solution in one of the two
following forms:

$$y_t = A(-a)^t + \frac{c}{1+a} \qquad [\text{general solution, case of } a \neq -1] \tag{17.8}$$

$$y_t = A(-a)^t + ct = A + ct \qquad [\text{general solution, case of } a = -1] \tag{17.9}$$

Neither of these is completely determinate, in view of the arbitrary constant A. To eliminate
this arbitrary constant, we resort to the initial condition that $y_t = y_0$ when t = 0. Letting
t = 0 in (17.8), we have

$$y_0 = A + \frac{c}{1+a} \quad\text{and}\quad A = y_0 - \frac{c}{1+a}$$


Consequently, the definite version of (17.8) is

$$y_t = \left(y_0 - \frac{c}{1+a}\right)(-a)^t + \frac{c}{1+a} \qquad [\text{definite solution, case of } a \neq -1] \tag{17.8'}$$

Letting t = 0 in (17.9), on the other hand, we find $y_0 = A$, so the definite version of
(17.9) is

$$y_t = y_0 + ct \qquad [\text{definite solution, case of } a = -1] \tag{17.9'}$$

If this last result is applied to Example 1, the solution that emerges is exactly the same as
the iterative solution (17.4).

You can check the validity of each of these solutions by the following two steps. First, by
letting t = 0 in (17.8'), see that the latter equation reduces to the identity $y_0 = y_0$,
signifying the satisfaction of the initial condition. Second, by substituting the $y_t$ formula (17.8')
and a similar $y_{t+1}$ formula, obtained by replacing t with (t + 1) in (17.8'), into (17.6), see
that the latter reduces to the identity c = c, signifying that the time path is consistent with
the given difference equation. The check on the validity of solution (17.9') is analogous.


Example 4

Solve the first-order difference equation

$$y_{t+1} - 5y_t = 1 \qquad \left(y_0 = \tfrac{7}{4}\right)$$


Following the procedure used in deriving (17.8'), we can find $y_c$ by trying a solution
$y_t = Ab^t$ (which implies $y_{t+1} = Ab^{t+1}$). Substituting these values into the homogeneous
version $y_{t+1} - 5y_t = 0$ and canceling the common factor $Ab^t$, we get b = 5. Thus

$$y_c = A(5)^t$$

To find $y_p$, try the solution $y_t = k$, which implies $y_{t+1} = k$. Substituting these into the
complete difference equation, we find $k = -\frac{1}{4}$. Hence

$$y_p = -\frac{1}{4}$$

It follows that the general solution is

$$y_t = y_c + y_p = A(5)^t - \frac{1}{4}$$

Letting t = 0 here and utilizing the initial condition $y_0 = \frac{7}{4}$, we obtain A = 2. Thus the
definite solution may finally be written as

$$y_t = 2(5)^t - \frac{1}{4}$$

Since the given difference equation of this example is a special case of (17.6), with
a = -5, c = 1, and $y_0 = \frac{7}{4}$, and since (17.8') is the solution “formula” for this type of
difference equation, we could have found our solution by inserting the specific parameter
values into (17.8'), with the result that

$$y_t = \left(\frac{7}{4} - \frac{1}{1-5}\right)(5)^t + \frac{1}{1-5} = 2(5)^t - \frac{1}{4}$$

which checks perfectly with the earlier answer.
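
The solution formulas for the two cases can also be wrapped in a small function and checked numerically. The following Python sketch is one possible way to do so, under the assumption that the equation is already in the normalized form $y_{t+1} + ay_t = c$; the function name `definite_solution` is hypothetical, and exact fractions are used so that the check against the difference equation is free of rounding error. The initial value 7/4 is the one implied by A = 2 and $y_p = -\frac{1}{4}$ in Example 4.

```python
from fractions import Fraction


def definite_solution(a, c, y0):
    """Return a function t -> y_t that solves y_{t+1} + a*y_t = c with y_0 = y0."""
    a, c, y0 = Fraction(a), Fraction(c), Fraction(y0)
    if a == -1:
        # Case a = -1: moving equilibrium, y_t = y_0 + c*t
        return lambda t: y0 + c * t
    equilibrium = c / (1 + a)  # particular solution c / (1 + a)
    # Case a != -1: deviation from equilibrium scaled by (-a)^t
    return lambda t: (y0 - equilibrium) * (-a) ** t + equilibrium


# Example 4: y_{t+1} - 5y_t = 1 with y_0 = 7/4, so a = -5 and c = 1
y = definite_solution(-5, 1, Fraction(7, 4))
print([str(y(t)) for t in range(4)])

# The path should satisfy the difference equation in every period checked
print(all(y(t + 1) - 5 * y(t) == 1 for t in range(10)))
```

Returning a function of t, rather than a list of values, mirrors the idea that a solution is a formula defining y in every time period.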


Note that the $y_{t+1}$ term in (17.6) has a unit coefficient. If a given difference equation
has a nonunit coefficient for this term, it must be normalized before using the solution
formula (17.8').
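
The normalization step amounts to dividing the whole equation by the coefficient of $y_{t+1}$. A minimal sketch follows, assuming integer coefficients and a hypothetical equation $3y_{t+1} - 6y_t = 9$ that does not come from the text; the helper name `normalize` is likewise an assumption.

```python
from fractions import Fraction


def normalize(m, n, c):
    """Rewrite m*y_{t+1} + n*y_t = c as y_{t+1} + a*y_t = c_norm and return (a, c_norm)."""
    if m == 0:
        raise ValueError("the coefficient of y_{t+1} must be nonzero")
    return Fraction(n, m), Fraction(c, m)


# Hypothetical equation 3y_{t+1} - 6y_t = 9 becomes y_{t+1} - 2y_t = 3
a, c = normalize(3, -6, 9)
print(a, c)
```

The resulting pair (a, c) is then in the form that the solution formulas expect.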







EXERCISE 17.2 


1. Convert the following difference equations into the form of (17.2''):
(a) $\Delta y_t = 7$
(b) $\Delta y_t = 0.3y_t$
(c) $\Delta y_t = 2y_t - 9$

2. Solve the following difference equations by iteration:
(a) $y_{t+1} = y_t + 1$  ($y_0 = 10$)
(b) $y_{t+1} = \alpha y_t$  ($y_0 = \beta$)
(c) $y_{t+1} = \alpha y_t - \beta$  ($y_t = y_0$ when t = 0)

3. Rewrite the equations in Prob. 2 in the form of (17.6), and solve by applying formula
(17.8') or (17.9'), whichever is appropriate. Do your answers check with those
obtained by the iterative method?

4. For each of the following difference equations, use the procedure illustrated in the
derivation of (17.8') and (17.9') to find $y_c$, $y_p$, and the definite solution:
(a) $y_{t+1} + 3y_t = 4$  ($y_0 = 4$)
(b) $2y_{t+1} - y_t = 6$  ($y_0 = 7$)
(c) $y_{t+1} = 0.2y_t + 4$  ($y_0 = 4$)


17.3 The Dynamic Stability of Equilibrium 





In the continuous-time case, the dynamic stability of equilibrium depends on the $Ae^{rt}$ term
in the complementary function. In period analysis, the corresponding role is played by the
$Ab^t$ term in the complementary function. Since its interpretation is somewhat more
complicated than $Ae^{rt}$, let us try to clarify it before proceeding further.


The Significance of b 

Whether the equilibrium is dynamically stable is a question of whether or not the
complementary function will tend to zero as $t \to \infty$. Basically, we must analyze the path of the
term $Ab^t$ as t is increased indefinitely. Obviously, the value of b (the base of this
exponential term) is of crucial importance in this regard. Let us first consider its significance alone,
by disregarding the coefficient A (by assuming A = 1).

For analytical purposes, we can divide the range of possible values of b, $(-\infty, +\infty)$,
into seven distinct regions, as set forth in the first two columns of Table 17.1, arranged in
descending order of magnitude of b. These regions are also marked off in Fig. 17.1 on a
vertical b scale, with the points +1, 0, and -1 as the demarcation points. In fact, these
latter three points in themselves constitute the regions II, IV, and VI. Regions III and V, on the
other hand, correspond to the set of all positive fractions and the set of all negative
fractions, respectively. The remaining two regions, I and VII, are where the numerical value of
b exceeds unity.

In each region, the exponential expression $b^t$ generates a different type of time path.
These are exemplified in Table 17.1 and illustrated in Fig. 17.1. In region I (where b > 1),
$b^t$ must increase with t at an increasing pace. The general configuration of the time path
will therefore assume the shape of the top graph in Fig. 17.1. Note that this graph is shown


TABLE 17.1  A Classification of the Values of b

                                                   Value of b^t in Different Time Periods
Region   Value of b                Value of b^t     t = 0   t = 1   t = 2   t = 3   t = 4
I        b > 1        (|b| > 1)    e.g., (2)^t      1       2       4       8       16
II       b = 1        (|b| = 1)    (1)^t            1       1       1       1       1
III      0 < b < 1    (|b| < 1)    e.g., (1/2)^t    1       1/2     1/4     1/8     1/16
IV       b = 0        (|b| = 0)    (0)^t            0       0       0       0       0
V        -1 < b < 0   (|b| < 1)    e.g., (-1/2)^t   1       -1/2    1/4     -1/8    1/16
VI       b = -1       (|b| = 1)    (-1)^t           1       -1      1       -1      1
VII      b < -1       (|b| > 1)    e.g., (-2)^t     1       -2      4       -8      16





as a step function rather than as a smooth curve; this is because we are dealing with period
analysis. In region II (b = 1), $b^t$ will remain at unity for all values of t. Its graph will thus
be a horizontal straight line. Next, in region III, $b^t$ represents a positive fraction raised to
integer powers. As the power is increased, $b^t$ must decrease, though it will always remain
positive. The next case, that of b = 0 in region IV, is quite similar to the case of b = 1; but
here we have $b^t = 0$ rather than $b^t = 1$, so its graph will coincide with the horizontal axis.
However, this case is of peripheral interest only, since we have earlier adopted the
assumption that $Ab^t \neq 0$.

When we move into the negative regions, an interesting new phenomenon occurs: The
value of $b^t$ will alternate between positive and negative values from period to period! This
fact is clearly brought out in the last three rows of Table 17.1 and in the last three graphs of
Fig. 17.1. In region V, where b is a negative fraction, the alternating time path tends to get
closer and closer to the horizontal axis (cf. the positive-fraction region, III). In contrast,
when b = -1 (region VI), a perpetual alternation between +1 and -1 results. And finally,
when b < -1 (region VII), the alternating time path will deviate farther and farther from
the horizontal axis.
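
The behavior of $b^t$ in the different regions is easy to tabulate by machine. The sketch below uses one illustrative base per region; the particular bases 1/2 and -1/2 for regions III and V are assumptions chosen for the illustration. Region IV (b = 0) is left out, both because it is of peripheral interest and because Python evaluates `0 ** 0` as 1, which would not match the convention $b^t = 0$ used for that region.

```python
from fractions import Fraction

# One illustrative base per region (region IV, b = 0, is deliberately omitted)
bases = {
    "I": Fraction(2),
    "II": Fraction(1),
    "III": Fraction(1, 2),
    "V": Fraction(-1, 2),
    "VI": Fraction(-1),
    "VII": Fraction(-2),
}


def powers(b, periods=5):
    """Return [b^0, b^1, ..., b^(periods - 1)]."""
    return [b ** t for t in range(periods)]


for region, b in bases.items():
    print(region, [str(value) for value in powers(b)])
```

The alternation of signs for the negative bases shows up directly in the printed rows for regions V, VI, and VII.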

What is striking is that, whereas the phenomenon of a fluctuating time path cannot
possibly arise from a single $Ae^{rt}$ term (the complex-root case of the second-order differential
equation requires a pair of complex roots), fluctuation can be generated by a single $b^t$
(or $Ab^t$) term. Note, however, that the character of the fluctuation is somewhat different;
unlike the circular-function pattern, the fluctuation depicted in Fig. 17.1 is nonsmooth.
For this reason, we shall employ the word oscillation to denote the new, nonsmooth type
of fluctuation, even though many writers do use the terms fluctuation and oscillation
interchangeably.

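The essence of the preceding discussion can be conveyed in the following general
statement: The time path of $b^t$ ($b \neq 0$) will be

$$\begin{array}{ll}
\text{Nonoscillatory} & \text{if } b > 0 \\
\text{Oscillatory} & \text{if } b < 0 \\
\text{Divergent} & \text{if } |b| > 1 \\
\text{Convergent} & \text{if } |b| < 1
\end{array}$$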


It is important to note that, whereas the convergence of the expression $e^{rt}$ depends on the sign
of r, the convergence of the $b^t$ expression hinges, instead, on the absolute value of b.


FIGURE 17.1

[Figure: the configuration of $b^t$ in each of regions I through VII, drawn as step functions of t, with the regions marked on a vertical b scale at +1, 0, and -1.]








The Role of A 


So far we have deliberately left out the multiplicative constant A. But its effects, of which
there are two, are relatively easy to take into account. First, the magnitude of A can serve
to “blow up” (if, say, A = 3) or “pare down” (if, say, A = 1/5) the values of $b^t$. That is, it can
produce a scale effect without changing the basic configuration of the time path. The sign of
A, on the other hand, does materially affect the shape of the path because, if $b^t$ is multiplied
by A = -1, then each time path shown in Fig. 17.1 will be replaced by its own mirror
image with reference to the horizontal axis. Thus, a negative A can produce a mirror effect
as well as a scale effect.
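
A brief numerical sketch may help to separate the two effects. The base b = 2 and the multipliers 3 and -1 below are illustrative choices, and the helper name `path` is an assumption.

```python
def path(A, b, periods=4):
    """Return [A*b^0, A*b^1, ..., A*b^(periods - 1)]."""
    return [A * b ** t for t in range(periods)]


base = path(1, 2)       # the path of b^t alone
scaled = path(3, 2)     # scale effect: every value is tripled
mirrored = path(-1, 2)  # mirror effect: every value changes sign

print(base)
print(scaled)
print(mirrored)
```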


Convergence to Equilibrium 
The preceding discussion presents the interpretation of the $Ab^t$ term in the complementary
function, which, as we recall, represents the deviations from some intertemporal
equilibrium level. If a term (say) $y_p = 5$ is added to the $Ab^t$ term, the time path must be shifted up
vertically by a constant value of 5. This will in no way affect the convergence or divergence
of the time path, but it will alter the level with reference to which convergence or
divergence is gauged. What Fig. 17.1 pictures is the convergence (or lack of it) of the $Ab^t$
expression to zero. When the $y_p$ is included, it becomes a question of the convergence of
the time path $y_t = y_c + y_p$ to the equilibrium level $y_p$.

In this connection, let us add a word of explanation for the special case of b = 1 (region II).
A time path such as

$$y_t = A(1)^t + y_p = A + y_p$$

gives the impression that it converges, because the multiplicative term $(1)^t = 1$ produces
no explosive effect. Observe, however, that $y_t$ will now take the value $(A + y_p)$ rather than
the equilibrium value $y_p$; in fact, it can never reach $y_p$ (unless A = 0). As an illustration of
this type of situation, we can cite the time path in (17.9), in which a moving equilibrium
$y_p = ct$ is involved. This time path is to be considered divergent, not because of the
appearance of t in the particular solution but because, with a nonzero A, there will be a
constant deviation from the moving equilibrium. Thus, in stipulating the condition for
convergence of time path $y_t$ to the equilibrium $y_p$, we must rule out the case of b = 1.
In sum, the solution

$$y_t = Ab^t + y_p$$

is a convergent path if and only if |b| < 1.
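
The two questions asked of any base b (does the path oscillate, and does it converge?) can be put into a small classifier. The sketch below is an illustration only; the function name `classify` is an assumption, and the borderline case |b| = 1 is reported as divergent in line with the argument just given for b = 1.

```python
def classify(b):
    """Describe the time path of b^t for a nonzero base b as a pair of labels."""
    if b == 0:
        raise ValueError("b must be nonzero")
    oscillation = "oscillatory" if b < 0 else "nonoscillatory"
    # Convergence requires |b| < 1; the borderline |b| = 1 is treated as divergent
    convergence = "convergent" if abs(b) < 1 else "divergent"
    return oscillation, convergence


for b in (2, 0.5, -0.8, -1, -2):   # illustrative bases
    print(b, classify(b))
```

Note that the multiplicative constant A and the particular solution $y_p$ play no part in the classification, since neither affects oscillation or convergence.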





Example 1

What kind of time path is represented by $y_t = 2(-\frac{4}{5})^t + 9$? Since $b = -\frac{4}{5} < 0$, the time path
is oscillatory. But since $|b| = \frac{4}{5} < 1$, the oscillation is damped, and the time path converges
to the equilibrium level of 9.
You should exercise care not to confuse $2(-\frac{4}{5})^t$ with $-2(\frac{4}{5})^t$; they represent entirely
different time-path configurations.

Example 2

How do you characterize the time path $y_t = 3(2)^t + 4$? Since b = 2 > 0, no oscillation will
occur. But since |b| = 2 > 1, the time path will diverge from the equilibrium level of 4.

EXERCISE 17.3 


1. Discuss the nature of the following time paths:
(a) $y_t = 3^t + 1$                  (c) $y_t = 5(-\frac{1}{10})^t + 3$
(b) $y_t = 2(\frac{1}{3})^t$         (d) $y_t = -3(\frac{1}{4})^t + 2$

2. What is the nature of the time path obtained from each of the difference equations in
Exercise 17.2-4?

3. Find the solutions of the following, and determine whether the time paths are
oscillatory and convergent:
(a) $y_{t+1} - \frac{1}{3}y_t = 6$  ($y_0 = 1$)
(b) $y_{t+1} + 2y_t = 9$  ($y_0 = 4$)
(c) $y_{t+1} + \frac{1}{4}y_t = 5$  ($y_0 = 2$)
(d) $y_{t+1} - y_t = 3$  ($y_0 = 5$)


17.4 The Cobweb Model


To illustrate the use of first-order difference equations in economic analysis, we shall cite
two variants of the market model for a single commodity. The first variant, known as the
cobweb model, differs from our earlier market models in that it treats $Q_s$ as a function not
of the current price but of the price of the preceding time period.


The Model 
Consider a situation in which the producer's output decision must be made one period in
advance of the actual sale, such as in agricultural production, where planting must
precede by an appreciable length of time the harvesting and sale of the output. Let us assume
that the output decision in period t is based on the then-prevailing price $P_t$. Since this
output will not be available for sale until period (t + 1), however, $P_t$ will determine
not $Q_{st}$ but $Q_{s,t+1}$. Thus we now have a “lagged” supply function:*

$$Q_{s,t+1} = S(P_t)$$

or, equivalently, by shifting back the time subscripts by one period,

$$Q_{st} = S(P_{t-1})$$

When such a supply function interacts with a demand function of the form

$$Q_{dt} = D(P_t)$$

interesting dynamic price patterns will result.

Taking the linear versions of these (lagged) supply and (unlagged) demand functions,
and assuming that in each time period the market price is always set at a level which clears
the market, we have a market model with the following three equations:

$$\begin{aligned}
Q_{dt} &= Q_{st} \\
Q_{dt} &= \alpha - \beta P_t \qquad (\alpha, \beta > 0) \\
Q_{st} &= -\gamma + \delta P_{t-1} \qquad (\gamma, \delta > 0)
\end{aligned} \tag{17.10}$$











* We are making the implicit assumption here that the entire output of a period will be placed on the 
market, with no part of it held in storage. Such an assumption is appropriate when the commodity in 
question is perishable or when no inventory is ever kept. A model with inventory will be considered 
in Sec. 17.5. 


By substituting the last two equations into the first, however, the model can be reduced to a
single first-order difference equation as follows:

$$\beta P_t + \delta P_{t-1} = \alpha + \gamma$$

In order to solve this equation, it is desirable first to normalize it and shift the time
subscripts ahead by one period [alter t to (t + 1), etc.]. The result,

$$P_{t+1} + \frac{\delta}{\beta} P_t = \frac{\alpha + \gamma}{\beta} \tag{17.11}$$

will then be a replica of (17.6), with the substitutions

$$y = P \qquad a = \frac{\delta}{\beta} \qquad \text{and} \qquad c = \frac{\alpha + \gamma}{\beta}$$

Inasmuch as $\delta$ and $\beta$ are both positive, it follows that $a \neq -1$. Consequently, we can apply
formula (17.8'), to get the time path

$$P_t = \left(P_0 - \frac{\alpha + \gamma}{\beta + \delta}\right)\left(-\frac{\delta}{\beta}\right)^t + \frac{\alpha + \gamma}{\beta + \delta} \tag{17.12}$$

where $P_0$ represents the initial price.
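
As a check on this time path, the sketch below simulates the market period by period and compares the simulated prices with the closed-form expression. The parameter values ($\alpha = 10$, $\beta = 2$, $\gamma = 2$, $\delta = 1$, $P_0 = 5$) are hypothetical and assumed to be integers so that exact fractions can be used; the function names are likewise assumptions.

```python
from fractions import Fraction


def cobweb_prices(alpha, beta, gamma, delta, p0, periods):
    """Iterate beta*P_t + delta*P_{t-1} = alpha + gamma starting from the price p0."""
    prices = [Fraction(p0)]
    for _ in range(periods):
        # Market clearing: alpha - beta*P_t = -gamma + delta*P_{t-1}
        prices.append((Fraction(alpha + gamma) - delta * prices[-1]) / beta)
    return prices


def cobweb_formula(alpha, beta, gamma, delta, p0, t):
    """Evaluate the closed-form price path at period t."""
    p_bar = Fraction(alpha + gamma, beta + delta)  # intertemporal equilibrium price
    return (p0 - p_bar) * Fraction(-delta, beta) ** t + p_bar


# Hypothetical model: Q_d = 10 - 2P_t, Q_s = -2 + P_{t-1}, P_0 = 5
params = dict(alpha=10, beta=2, gamma=2, delta=1)
simulated = cobweb_prices(p0=5, periods=6, **params)
formula = [cobweb_formula(p0=5, t=t, **params) for t in range(7)]

print([float(p) for p in simulated])
print(simulated == formula)
```

With these numbers the price alternates around the equilibrium value of 4 and approaches it, since $\delta/\beta = 1/2$ is less than 1.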


The Cobwebs 

Three points may be observed in regard to this time path. In the first place, the expression
$(\alpha + \gamma)/(\beta + \delta)$, which constitutes the particular integral of the difference equation, can
be taken as the intertemporal equilibrium price of the model:†

$$\bar{P} = \frac{\alpha + \gamma}{\beta + \delta}$$

Because this is a constant, it is a stationary equilibrium. Substituting $\bar{P}$ into our solution,
we can express the time path $P_t$ alternatively in the form

$$P_t = (P_0 - \bar{P})\left(-\frac{\delta}{\beta}\right)^t + \bar{P} \tag{17.12'}$$

This leads us to the second point, namely, the significance of the expression $(P_0 - \bar{P})$.
Since this corresponds to the constant A in the $Ab^t$ term, its sign will bear on the question
of whether the time path will commence above or below the equilibrium (mirror effect),
whereas its magnitude will decide how far above or below (scale effect). Lastly, there is the
expression $(-\delta/\beta)$, which corresponds to the b component of $Ab^t$. From our model
specification that $\beta, \delta > 0$, we can deduce an oscillatory time path. It is this fact which gives
rise to the cobweb phenomenon, as we shall presently see. There can, of course, arise three


† As far as the market-clearing sense of equilibrium is concerned, the price reached in each period is
an equilibrium price, because we have assumed that $Q_{dt} = Q_{st}$ for every t.


FIGURE 17.2

[Figure: two cobweb diagrams with demand curve D and supply curve S. (a) $\delta > \beta$ (S steeper than D). (b) $\delta < \beta$ (S flatter than D).]

















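possible varieties of oscillation patterns in the model. According to Table 17.1 or Fig. 17.1,
the oscillation will be

$$\begin{array}{ll}
\text{Explosive} & \text{if } \delta > \beta \\
\text{Uniform} & \text{if } \delta = \beta \\
\text{Damped} & \text{if } \delta < \beta
\end{array}$$

where the term uniform oscillation refers to the type of path in region VI.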
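
The three cases can be illustrated with a short classifier together with the deviations of price from equilibrium, $(P_0 - \bar{P})(-\delta/\beta)^t$. The slope pairs below are hypothetical integers, and the function names are assumptions made for this sketch.

```python
from fractions import Fraction


def cobweb_oscillation(beta, delta):
    """Classify the oscillation of the cobweb price path from the slopes beta and delta."""
    if beta <= 0 or delta <= 0:
        raise ValueError("the model assumes beta > 0 and delta > 0")
    if delta > beta:
        return "explosive"
    if delta == beta:
        return "uniform"
    return "damped"


def price_deviations(beta, delta, initial_deviation, periods):
    """Return the deviations P_t - P_bar = (P_0 - P_bar) * (-delta/beta)^t."""
    ratio = Fraction(-delta, beta)
    return [initial_deviation * ratio ** t for t in range(periods)]


for beta, delta in [(2, 3), (2, 2), (3, 2)]:  # hypothetical slope pairs
    label = cobweb_oscillation(beta, delta)
    deviations = [float(d) for d in price_deviations(beta, delta, 1, 5)]
    print(label, deviations)
```

In every case the deviations alternate in sign; only their magnitudes distinguish the explosive, uniform, and damped patterns.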

In order to visualize the cobwebs, let us depict the model (17.10) in Fig. 17.2. The
second equation of (17.10) plots as a downward-sloping linear demand curve, with its slope
numerically equal to $\beta$. Similarly, a linear supply curve with a slope equal to $\delta$ can be drawn
from the third equation, if we let the Q axis represent in this instance a lagged quantity
supplied. The case of $\delta > \beta$ (S steeper than D) and the case of $\delta < \beta$ (S flatter than D) are
illustrated in Fig. 17.2a and b, respectively. In either case, however, the intersection of D
and S will yield the intertemporal equilibrium price $\bar{P}$.

When $\delta > \beta$, as in Fig. 17.2a, the interaction of demand and supply will produce an
explosive oscillation as follows. Given an initial price $P_0$ (here assumed above $\bar{P}$), we can
follow the arrowhead and read off on the S curve that the quantity supplied in the next
period (period 1) will be $Q_1$. In order to clear the market, the quantity demanded in period
1 must also be $Q_1$, which is possible if and only if price is set at the level of $P_1$ (see
downward arrow). Now, via the S curve, the price $P_1$ will lead to $Q_2$ as the quantity supplied in
period 2, and to clear the market in the latter period, price must be set at the level of $P_2$
according to the demand curve. Repeating this reasoning, we can trace out the prices and
quantities in subsequent periods by simply following the arrowheads in the diagram,
thereby spinning a “cobweb” around the demand and supply curves. By comparing the
price levels, $P_0, P_1, P_2, \ldots$, we observe in this case not only an oscillatory pattern of
change but also a tendency for price to widen its deviation from $\bar{P}$ as time goes by.
With the cobweb being spun from inside out, the time path is divergent and the oscillation
explosive.










By way of contrast, in the case of Fig. 17.2b, where $\delta < \beta$, the spinning process will
create a cobweb which is centripetal. From $P_0$, if we follow the arrowheads, we shall be
led ever closer to the intersection of the demand and supply curves, where $\bar{P}$ is. While still
oscillatory, this price path is convergent.

In Fig. 17.2 we have not shown a third possibility, namely, that of $\delta = \beta$. The procedure
of graphical analysis involved, however, is perfectly analogous to the other two cases. It is
therefore left to you as an exercise.

The preceding discussion has dealt only with the time path of P (that is, $P_t$); after $P_t$ is
found, however, it takes but a short step to get to the time path of Q. The second equation
of (17.10) relates $Q_{dt}$ to $P_t$, so if (17.12) or (17.12') is substituted into the demand
equation, the time path of $Q_{dt}$ can be obtained immediately. Moreover, since $Q_{dt}$ must be equal
to $Q_{st}$ in each time period (clearance of market), we can simply refer to the time path as $Q_t$
rather than $Q_{dt}$. On the basis of Fig. 17.2, the rationale of this substitution is easily seen.
Each point on the D curve relates a $P_i$ to a $Q_i$ pertaining to the same time period; therefore,
the demand function can serve to map the time path of price into the time path of quantity.

You should note that the graphical technique of Fig. 17.2 is applicable even when the D 
and § curves are nonlinear, 
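The arrow-following procedure can also be imitated numerically. The sketch below is only an illustration: it assumes the linear forms of the cobweb model (demand $Q_t = \alpha - \beta P_t$, lagged supply $Q_t = -\gamma + \delta P_{t-1}$), and the function names and parameter values are made up for the purpose. Each pass through the loop reads a quantity off the S curve and then finds the market-clearing price on the D curve.

```python
def cobweb_path(alpha, beta, gamma, delta, p0, periods):
    """Trace prices and quantities period by period, as the arrowheads do.

    Assumed linear forms (hypothetical parameter names):
      demand:  Q_t = alpha - beta * P_t
      supply:  Q_t = -gamma + delta * P_{t-1}   (one-period lag)
    """
    prices = [p0]
    quantities = []
    for _ in range(periods):
        q = -gamma + delta * prices[-1]   # read quantity off the S curve
        p = (alpha - q) / beta            # price on the D curve that clears q
        quantities.append(q)
        prices.append(p)
    return prices, quantities


def equilibrium_price(alpha, beta, gamma, delta):
    """Price at the intersection of the demand and supply curves."""
    return (alpha + gamma) / (beta + delta)


# Illustrative numbers only: delta < beta (convergent) versus delta > beta (explosive).
damped, _ = cobweb_path(10, 2, 2, 1, p0=5.0, periods=4)
explosive, _ = cobweb_path(10, 2, 2, 3, p0=3.0, periods=4)
```

In the first run the deviation from equilibrium is halved and reversed in sign every period; in the second it grows by half each period, which corresponds to the cobweb being spun from inside out.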








EXERCISE 17.4 


1. On the basis of (17.10), find the time path of $Q_t$, and analyze the condition for its
convergence.

2. Draw a diagram similar to those of Fig. 17.2 to show that, for the case of $\delta = \beta$, the
price will oscillate uniformly with neither damping nor explosion.

3. Given demand and supply for the cobweb model as follows, find the intertemporal
equilibrium price, and determine whether the equilibrium is stable:
(a) $Q_{dt} = 18 - 3P_t$     $Q_{st} = -3 + 4P_{t-1}$
(b) $Q_{dt} = 22 - 3P_t$     $Q_{st} = -2 + P_{t-1}$
(c) $Q_{dt} = 19 - 6P_t$     $Q_{st} = 6P_{t-1} - 5$

4. In model (17.10), let the $Q_{dt} = Q_{st}$ condition and the demand function remain as they
are, but change the supply function to
$$Q_{st} = -\gamma + \delta P_t^*$$
where $P_t^*$ denotes the expected price for period t. Furthermore, suppose that sellers
have the “adaptive” type of price expectation:
$$P_t^* = P_{t-1}^* + \eta(P_{t-1} - P_{t-1}^*) \qquad (0 < \eta \le 1)$$
where $\eta$ (the Greek letter eta) is an expectation-adjustment coefficient.
(a) Give an economic interpretation to the preceding equation. In what respects is it
similar to, and different from, the adaptive expectations equation (16.34)?
(b) What happens if $\eta$ takes its maximum value? Can we consider the cobweb model
as a special case of the present model?





* See Marc Nerlove, “Adaptive Expectations and Cobweb Phenomena,” Quarterly Journal of
Economics, May 1958, pp. 227-240.




(c) Show that the new model can be represented by the first-order difference equation
$$P_{t+1} - \left(1 - \eta - \frac{\eta\delta}{\beta}\right)P_t = \frac{\eta(\alpha + \gamma)}{\beta}$$
(Hint: Solve the supply function for $P_t^*$, and then use the information that
$Q_{st} = Q_{dt} = \alpha - \beta P_t$.)

(d) Find the time path of price. Is this path necessarily oscillatory? Can it be oscillatory?
Under what circumstances?

(e) Show that the time path $P_t$, if oscillatory, will converge only if $1 - 2/\eta < -\delta/\beta$. As
compared with the cobweb solution (17.12) or (17.12'), does the new model have
a wider or narrower range for the stability-inducing values of $-\delta/\beta$?

5. The cobweb model, like the previously encountered dynamic market models, is essentially
based on the static market model presented in Sec. 3.2. What economic assumption
is the dynamizing agent in the present case? Explain.





17.5 A Market Model with Inventory 


In the preceding model, price is assumed to be set in such a way as to clear the current
output of every time period. The implication of that assumption is either that the commodity
is a perishable which cannot be stocked or that, though it is stockable, no inventory is ever
kept. Now we shall construct a model in which sellers do keep an inventory of the
commodity.


The Model 


Let us assume the following:

1. Both the quantity demanded, $Q_{dt}$, and the quantity currently produced, $Q_{st}$, are
unlagged linear functions of price $P_t$.

2. The adjustment of price is effected not through market clearance in every period, but
through a process of price-setting by the sellers: At the beginning of each period, the
sellers set a price for that period after taking into consideration the inventory situation.
If, as a result of the preceding-period price, inventory accumulated, the current-period
price is set at a lower level than before, in order to “move” the merchandise; but if
inventory decumulated instead, the current price is set higher than before.

3. The price adjustment made from period to period is inversely proportional to the
observed change in the inventory (stock).


With these assumptions, we can write the following equations: 
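$$\begin{aligned}
Q_{dt} &= \alpha - \beta P_t & (\alpha, \beta > 0) \\
Q_{st} &= -\gamma + \delta P_t & (\gamma, \delta > 0) \\
P_{t+1} &= P_t - \sigma(Q_{st} - Q_{dt}) & (\sigma > 0)
\end{aligned} \tag{17.13}$$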


where $\sigma$ denotes the stock-induced-price-adjustment coefficient. Note that (17.13) is really
nothing but the discrete-time counterpart of the market model of Sec. 15.2, although we
have now couched the price-adjustment process in terms of inventory $(Q_{st} - Q_{dt})$ rather
than excess demand $(Q_{dt} - Q_{st})$. Nevertheless, the analytical results will turn out to be
much different; for one thing, with discrete time, we may encounter the phenomenon of
oscillations. Let us derive and analyze the time path $P_t$.

TABLE 17.2 Types of Time Path


The Time Path 
By substituting the first two equations into the third, the model can be condensed into a 
single difference equation: 


$$P_{t+1} - [1 - \sigma(\beta + \delta)]P_t = \sigma(\alpha + \gamma) \tag{17.14}$$


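and its solution is given by (17.8'):

$$\begin{aligned}
P_t &= \left(P_0 - \frac{\alpha + \gamma}{\beta + \delta}\right)[1 - \sigma(\beta + \delta)]^t + \frac{\alpha + \gamma}{\beta + \delta} \\
    &= (P_0 - \bar{P})[1 - \sigma(\beta + \delta)]^t + \bar{P}
\end{aligned} \tag{17.15}$$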


Obviously, therefore, the dynamic stability of the model will hinge on the expression
$1 - \sigma(\beta + \delta)$; for convenience, let us refer to this expression as b.

With reference to Table 17.1, we see that, in analyzing the exponential expression $b^t$,
seven distinct regions of b values may be defined. However, since our model specifications
$(\sigma, \beta, \delta > 0)$ have effectually ruled out the first two regions, there remain only five
possible cases, as listed in Table 17.2. For each of these regions, the b specification of the second
column can be translated into an equivalent $\sigma$ specification, as shown in the third column.
For instance, for region III, the b specification is $0 < b < 1$; therefore, we can write


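$$\begin{aligned}
0 < 1 - \sigma(\beta + \delta) &< 1 \\
-1 < -\sigma(\beta + \delta) &< 0 && [\text{subtracting 1 from all three parts}] \\
\text{and} \qquad \frac{1}{\beta + \delta} > \sigma &> 0 && [\text{dividing through by } -(\beta + \delta)]
\end{aligned}$$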
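Region   Value of $b \equiv 1 - \sigma(\beta + \delta)$   Value of $\sigma$                                            Nature of time path $P_t$

III      $0 < b < 1$                      $0 < \sigma < 1/(\beta + \delta)$                       Nonoscillatory and convergent
IV       $b = 0$                          $\sigma = 1/(\beta + \delta)$                           Remaining in equilibrium*
V        $-1 < b < 0$                     $1/(\beta + \delta) < \sigma < 2/(\beta + \delta)$      With damped oscillation
VI       $b = -1$                         $\sigma = 2/(\beta + \delta)$                           With uniform oscillation
VII      $b < -1$                         $\sigma > 2/(\beta + \delta)$                           With explosive oscillation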





* The fact that price will be remaining in equilibrium in this case can also be seen directly from (17.14). With $\sigma = 1/(\beta + \delta)$, the
coefficient of $P_t$ becomes zero, and (17.14) reduces to $P_{t+1} = \sigma(\alpha + \gamma) = (\alpha + \gamma)/(\beta + \delta) = \bar{P}$.


Example 1 


FIGURE 17.3




This last gives us the desired equivalent $\sigma$ specification for region III. The translation for
the other regions may be carried out analogously. Since the type of time path pertaining to
each region is already known from Fig. 17.1, the $\sigma$ specification enables us to tell from
given values of $\sigma$, $\beta$, and $\delta$ the general nature of the time path $P_t$, as outlined in the last
column of Table 17.2.


If the sellers in our model always increase (decrease) the price by 10 percent of the amount
of the decrease (increase) in inventory, and if the demand curve has a slope of $-1$ and the
supply curve a slope of 15 (both slopes with respect to the price axis), what type of time
path $P_t$ will we find?

Here, we have $\sigma = 0.1$, $\beta = 1$, and $\delta = 15$. Since $1/(\beta + \delta) = 1/16$ and $2/(\beta + \delta) = 1/8$, the
value of $\sigma$ $(= 1/10)$ lies between the former two values; it is thus a case of region V. The time
path $P_t$ will be characterized by damped oscillation.
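The region test used in this example can be written out as a short routine. What follows is a sketch under stated assumptions, not part of the text's own apparatus: the function names are hypothetical, the tolerance used to detect the borderline cases $b = 0$ and $b = -1$ is an arbitrary choice, and the values of $\alpha$ and $\gamma$ in the simulation are invented because the example does not specify them.

```python
import math


def classify_inventory_path(sigma, beta, delta, tol=1e-12):
    """Return (b, region, description) for b = 1 - sigma*(beta + delta)."""
    if min(sigma, beta, delta) <= 0:
        raise ValueError("the model assumes sigma, beta, delta > 0")
    b = 1 - sigma * (beta + delta)
    if math.isclose(b, 0.0, abs_tol=tol):
        return b, "IV", "remaining in equilibrium"
    if math.isclose(b, -1.0, abs_tol=tol):
        return b, "VI", "uniform oscillation"
    if b > 0:
        return b, "III", "nonoscillatory and convergent"
    if b > -1:
        return b, "V", "damped oscillation"
    return b, "VII", "explosive oscillation"


def inventory_path(alpha, beta, gamma, delta, sigma, p0, periods):
    """Iterate P_{t+1} = P_t - sigma*(Q_st - Q_dt) from the model (17.13)."""
    prices = [p0]
    for _ in range(periods):
        p = prices[-1]
        q_d = alpha - beta * p
        q_s = -gamma + delta * p
        prices.append(p - sigma * (q_s - q_d))
    return prices


# Example 1 values; alpha and gamma are made up because the example does not give them.
b, region, description = classify_inventory_path(sigma=0.1, beta=1, delta=15)
path = inventory_path(alpha=20, beta=1, gamma=12, delta=15, sigma=0.1, p0=3.0, periods=6)
```

With these inputs b works out to $-0.6$, so the routine reports region V, and the simulated prices alternate around the equilibrium price of 2 with a shrinking deviation.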


Graphical Summary of the Results 

The substance of Table 17.2, which contains as many as five different possible cases of $\sigma$
specification, can be made much easier to grasp if the results are presented graphically.
Inasmuch as the $\sigma$ specification involves essentially a comparison of the relative
magnitudes of the parameters $\sigma$ and $(\beta + \delta)$, let us plot $\sigma$ against $(\beta + \delta)$, as in Fig. 17.3. Note
that we need only concern ourselves with the positive quadrant because, by model
specification, $\sigma$ and $(\beta + \delta)$ are both positive. From Table 17.2, it is clear that regions IV and VI
are specified by the equations $\sigma = 1/(\beta + \delta)$ and $\sigma = 2/(\beta + \delta)$, respectively. Since each
of these plots as a rectangular hyperbola, the two regions are graphically represented by the
two hyperbolic curves in Fig. 17.3. Once we have the two hyperbolas, moreover, the other
three regions immediately fall into place. Region III, for instance, is merely the set of points
lying below the lower hyperbola, where we have $\sigma$ less than $1/(\beta + \delta)$. Similarly, region V
is represented by the set of points falling between the two hyperbolas, whereas all the points
located above the higher hyperbola pertain to region VII.


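[Figure 17.3: regions of the $\sigma$ versus $(\beta + \delta)$ plane, including Region VII and the hyperbola $\sigma = 1/(\beta + \delta)$ (Region IV).]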










Example 2 


fo = 4, 6 =1, and § = 3, will our model (17.13) yield a convergent time path P,? The 
given parametric values correspond to point A in Fig. 17.3. Since it falls within region V, the 
time path is convergent, though oscillatory. 


You will note that, in the two models just presented, our analytical results are in each
instance stated as a set of alternative possible cases: three types of oscillatory path for
the cobwebs, and five types of time path in the inventory model. This richness of analytical
results stems, of course, from the parametric formulation of the models. The fact that our
result cannot be stated in a single unequivocal answer is, of course, a merit rather than a
weakness.





EXERCISE 17.5 


1. In solving (17.14), why should formula (17.8') be used instead of (17.9')?
2. On the basis of Table 17.2, check the validity of the translation from the b specification
to the $\sigma$ specification for regions IV through VII.
3. If model (17.13) has the following numerical form:
$$Q_{dt} = 21 - 2P_t$$
$$Q_{st} = -3 + 6P_t$$
$$P_{t+1} = P_t - 0.3(Q_{st} - Q_{dt})$$
find the time path $P_t$ and determine whether it is convergent.


4. Suppose that, in model (17.13), the supply in each period is a fixed quantity, say,
$Q_{st} = k$, instead of a function of price. Analyze the behavior of price over time. What
restriction should be imposed on k to make the solution economically meaningful?


17.6 Nonlinear Difference Equations—The 
Qualitative-Graphic Approach 





Thus far we have only utilized linear difference equations in our models; but the facts of
economic life may not always acquiesce to the convenience of linearity. Fortunately, when
nonlinearity occurs in the case of first-order difference-equation models, there exists an
easy method of analysis that is applicable under fairly general conditions. This method,
graphic in nature, closely resembles that of the qualitative analysis of first-order
differential equations presented in Sec. 15.6.


Phase Diagram 
Nonlinear difference equations in which only the variables $y_{t+1}$ and $y_t$ appear, such as


Yat ¥ =3 or ye) tsinpy, ln y = 3 
can be categorically represented by the equation
$$y_{t+1} = f(y_t) \tag{17.16}$$


where f can be a function of any degree of complexity, as long as it is a function of $y_t$ alone
without t as another argument. When the two variables $y_{t+1}$ and $y_t$ are plotted against each


FIGURE 17.4 




other in a Cartesian coordinate plane, the resulting diagram constitutes a phase diagram,
and the curve corresponding to f is a phase line. From these, it is possible to analyze the
time path of the variable by the process of iteration.

The terms phase diagram and phase line are used here in analogy to the differential-equation
case; but note one dissimilarity in the construction of the diagram. In the differential-equation
case, we plotted $dy/dt$ against y as in Fig. 15.3, so that, in order to be perfectly
analogous in the present case, we should have $\Delta y_t$ on the vertical axis and $y_t$ on the
horizontal. This is not impossible to do, but it is much more convenient to place $y_{t+1}$ on the
vertical axis instead, as we have done in Fig. 17.4, where the same scale is used on both axes.
Note the presence of a 45° line in each diagram of Fig. 17.4; this line will prove to be of
great service in carrying out our graphic analysis.

Let us illustrate the procedure involved by means of Fig. 17.4a, where we have drawn a
phase line (labeled $f_1$) representing a specific difference equation $y_{t+1} = f_1(y_t)$. If we are
given an initial value $y_0$ (plotted on the horizontal axis), by iteration we can trace out all the
subsequent values of y as follows. First, since the phase line $f_1$ maps the initial value $y_0$
into $y_1$ according to the equation

$$y_1 = f_1(y_0)$$

we can go straight up from $y_0$ to the phase line, hit point A, and read its height on the
vertical axis as the value of $y_1$. Next, we seek to map $y_1$ into $y_2$ according to the equation

$$y_2 = f_1(y_1)$$


For this purpose, we must first plot $y_1$ on the horizontal axis, similarly to $y_0$ during the
first mapping. This required transplotting of $y_1$ from the vertical axis to the horizontal is
most easily accomplished by the use of the 45° line, which, having a slope of +1, is the
locus of points with identical abscissa and ordinate, such as (2, 2) and (5, 5). Thus, to
transplot $y_1$ from the vertical axis, we can simply go across to the 45° line, hit point B, and
then turn straight down to the horizontal axis to locate the point $y_1$. By repeating this
process, we can map $y_1$ to $y_2$ via point C on the phase line, and then use the 45° line for
transplotting $y_2$, etc.

Now that the nature of the iteration is clear, we may observe that the desired iteration can
be achieved simply by following the arrowheads from $y_0$ to A (on the phase line), to B (on
the 45° line), to C (on the phase line), etc., always alternating between the two lines,
without it ever being necessary to resort to the axes again.
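The alternation between the phase line and the 45° line amounts to repeated application of f, which the following sketch makes explicit. It is an illustration only: the function names are hypothetical, and the phase line $f(y) = \sqrt{y} + 2$ is an invented example (not one of the curves in Fig. 17.4) chosen because its slope lies between 0 and 1 near its equilibrium at $y = 4$.

```python
import math


def iterate_phase_line(f, y0, steps):
    """Return [y_0, y_1, ..., y_steps] generated by y_{t+1} = f(y_t)."""
    path = [y0]
    for _ in range(steps):
        path.append(f(path[-1]))
    return path


def arrow_vertices(f, y0, steps):
    """Points hit by the arrowheads, alternating between the two lines."""
    vertices = []
    y = y0
    for _ in range(steps):
        y_next = f(y)
        vertices.append((y, y_next))        # on the phase line (points A, C, ...)
        vertices.append((y_next, y_next))   # on the 45-degree line (point B, ...)
        y = y_next
    return vertices


def example_phase_line(y):
    """Hypothetical phase line with slope between 0 and 1 near its equilibrium y = 4."""
    return math.sqrt(y) + 2


path = iterate_phase_line(example_phase_line, y0=1.0, steps=30)
vertices = arrow_vertices(example_phase_line, y0=1.0, steps=2)
```

The list `vertices` holds the coordinates one would connect with arrows in the diagram, while `path` holds only the successive values of y, which here rise steadily toward 4.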





Types of Time Path 

The graphic iterations just outlined are, of course, equally applicable to the other three
diagrams in Fig. 17.4. Actually, these four diagrams serve to illustrate four basic varieties
of phase lines, each implying a different type of time path. The first two phase lines, $f_1$ and
$f_2$, are characterized by positive slopes, with one slope being less than unity and the other
one greater than unity:

$$0 < f_1'(y_t) < 1 \qquad \text{and} \qquad f_2'(y_t) > 1$$

The remaining two, on the other hand, are negatively sloped; specifically, we have

$$-1 < f_3'(y_t) < 0 \qquad \text{and} \qquad f_4'(y_t) < -1$$


In each diagram of Fig. 17.4, the intertemporal equilibrium value of y (namely $\bar{y}$) is
located at the intersection of the phase line and the 45° line, which we have labeled E. This
is so because the point E on the phase line, being simultaneously a point on the 45° line,
will map a $y_t$ into a $y_{t+1}$ of identical value; and when $y_{t+1} = y_t$, by definition y must be in
equilibrium intertemporally. Our principal task is to determine whether, given an initial
value $y_0 \ne \bar{y}$, the pattern of change implied by the phase line will lead us consistently
toward $\bar{y}$ (convergent) or away from it (divergent).

For the phase line $f_1$, the iterative process leads from $y_0$ to $\bar{y}$ in a steady path, without
oscillation. You can verify that, if $y_0$ is placed to the right of $\bar{y}$, there will also be a steady
movement toward $\bar{y}$, although it will be in the leftward direction. These time paths are
convergent to equilibrium, and their general configurations would be of the same type as
shown in region III of Fig. 17.1.




Given the phase line $f_2$, whose slope exceeds unity, however, a divergent time path
emerges. From an initial value $y_0$ greater than $\bar{y}$, the arrowheads lead steadily away from
the equilibrium to higher and higher y values. As you can verify, an initial value lower than
$\bar{y}$ gives rise to a similar steady divergent movement, though in the opposite direction.

When the phase line is negatively inclined, as in $f_3$ and $f_4$, the steady movement gives
way to oscillation, and there appears now the phenomenon of overshooting the equilibrium
mark. In diagram c, $y_0$ leads to $y_1$, which exceeds $\bar{y}$, only to be followed by $y_2$, which falls
short of $\bar{y}$, etc. The convergence of the time path will, in such cases, depend on the slope of
the phase line being less than 1 in its absolute value. This is the case of the phase line $f_3$,
where the extent of overshooting tends to diminish in successive periods. For the phase line
$f_4$, whose slope exceeds 1 numerically, on the other hand, the opposite tendency prevails,
resulting in a divergent time path.

The oscillatory time paths generated by phase lines $f_3$ and $f_4$ are reminiscent of the
cobwebs in Fig. 17.2. In Fig. 17.4c or d, however, the cobweb is spun around a phase line
(which contains a lag) and the 45° line, instead of around a demand curve and a (lagged)
supply curve. Here, a 45° line is used as a mechanical aid for transplotting a value of y,
whereas in Fig. 17.2, the D curve (which plays a role similar to that of the 45° line in
Fig. 17.4) is an integral part of the model itself. Specifically, once $Q_{st}$ is determined on the
supply curve, we let the arrowheads hit the D curve for the purpose of finding a price that
will “clear the market,” as was the rule of the game in the cobweb model. Consequently,
there is a basic difference in the labeling of the axes: in Fig. 17.2 there are two entirely
different variables, P and Q, but in Fig. 17.4 the axes represent the values of the same variable
y in two consecutive periods. Note, however, that if we analyze the graph of the difference
equation (17.11) which summarizes the cobweb model, rather than the separate demand
and supply functions in (17.10), then the resulting diagram will be a phase line such as
shown in Fig. 17.4. In other words, there really exist two alternative ways of graphically
analyzing the cobweb model, which will yield the identical result.

The basic rule emerging from the preceding consideration of the phase line is that the
algebraic sign of its slope determines whether there will be oscillation, and the absolute
value of its slope governs the question of convergence. If the phase line happens to contain
both positively and negatively sloped segments, and if the absolute value of its slope is at
some points greater, and elsewhere less, than 1, the time path will naturally become more
complicated. However, even in such cases, the graphic-iterative analysis can be employed
with equal ease. Of course, an initial value must be given to us before the iteration can be
duly started. Indeed, in these more complicated cases, a different initial value can lead to a
time path of an altogether different breed (see Exercises 17.6-2 and 17.6-3).
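For a phase line whose slope keeps the same sign and stays on the same side of 1 in absolute value, the basic rule can be stated as a two-way test. The sketch below assumes exactly that simple situation: the helper names are hypothetical, and the straight-line phase lines through an arbitrary equilibrium $\bar{y} = 10$ serve only to check the rule against direct iteration.

```python
def describe_time_path(slope):
    """Sign of the slope decides oscillation; its absolute value decides convergence."""
    pattern = "oscillatory" if slope < 0 else "nonoscillatory"
    if abs(slope) < 1:
        trend = "convergent"
    elif abs(slope) > 1:
        trend = "divergent"
    else:
        trend = "neither convergent nor divergent"
    return f"{pattern}, {trend}"


def linear_phase_line(slope, y_bar):
    """Straight phase line through the equilibrium point (y_bar, y_bar)."""
    return lambda y: y_bar + slope * (y - y_bar)


def deviations(slope, y_bar=10.0, y0=11.0, steps=5):
    """Successive deviations y_t - y_bar obtained by iterating the phase line."""
    f = linear_phase_line(slope, y_bar)
    y, out = y0, []
    for _ in range(steps):
        out.append(y - y_bar)
        y = f(y)
    return out
```

For instance, `deviations(-0.5)` alternates in sign while shrinking, matching the label that `describe_time_path(-0.5)` returns. The test is not meant for phase lines with mixed slopes, where the outcome depends on the initial value.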






A Market with a Price Ceiling 
We shall now cite an economic example of a nonlinear difference equation. In Fig. 17.4, the
four nonlinear phase lines all happen to be of the smooth variety; in the present example,
we shall show a nonsmooth phase line.

As a point of departure, let us take the linear difference equation (17.11) of the cobweb
model and rewrite it as

$$P_{t+1} = \frac{\alpha + \gamma}{\beta} - \frac{\delta}{\beta}P_t \qquad \left(\frac{\delta}{\beta} > 0\right) \tag{17.17}$$




FIGURE 17.5

[Phase diagram: $P_{t+1}$ on the vertical axis and $P_t$ on the horizontal axis, with the linear phase line, the 45° line, and the price ceiling $\hat{P}$.]


This is in the format of $P_{t+1} = f(P_t)$, with $f'(P_t) = -\delta/\beta < 0$. We have plotted this
linear phase line in Fig. 17.5 on the assumption that the slope is greater than 1 in absolute
value, implying explosive oscillation.

Now let there be imposed a legal price ceiling $\hat{P}$ (read: “P caret” or, less formally, “P
hat”). This can be shown in Fig. 17.5 as a horizontal straight line because, irrespective of
the level of $P_t$, $P_{t+1}$ is now forbidden to exceed the level of $\hat{P}$. What this does is to
invalidate that part of the phase line lying above $\hat{P}$ or, to view it differently, to bend down the
upper part of the phase line to the level of $\hat{P}$, thus resulting in a kinked phase line.† In
view of the kink, the new (heavy) phase line is not only nonlinear but nonsmooth as well.
Like a step function, this kinked line will require more than one equation to express it
algebraically:

$$P_{t+1} = \begin{cases} \hat{P} & (\text{for } P_t \le k) \\[4pt] \dfrac{\alpha + \gamma}{\beta} - \dfrac{\delta}{\beta}P_t & (\text{for } P_t > k) \end{cases} \tag{17.17'}$$

where k denotes the value of $P_t$ at the kink.

Assuming an initial price $P_0$, let us trace out the time path of price iteratively. During the
first stage of iteration, when the downward-sloping segment of the phase line is in effect,
the explosive oscillatory tendency clearly manifests itself. After a few periods, however, the
arrowheads begin to hit the ceiling price, and thereafter the time path will develop into a
perpetual cyclical movement between $\hat{P}$ and an effective price floor $\tilde{P}$ (read: “P tilde”
or, less formally, “P wiggle”). Thus, by virtue of the price ceiling, the intrinsic explosive
tendency of the model is effectively contained, and the ever-widening oscillation is now
tamed into a uniform oscillation producing a so-called limit cycle.
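A brief simulation shows how the ceiling converts an explosive oscillation into a limit cycle. This is a sketch with made-up numbers ($\alpha = 10$, $\beta = 2$, $\gamma = 2$, $\delta = 3$, ceiling 3), chosen so that the phase-line slope $-\delta/\beta = -1.5$ is explosive; the function names are hypothetical, and the kinked phase line is coded as the smaller of the ceiling and the linear expression.

```python
def ceiling_path(alpha, beta, gamma, delta, p_hat, p0, periods):
    """Iterate the kinked phase line: the linear cobweb rule, capped at p_hat."""
    prices = [p0]
    for _ in range(periods):
        linear_part = (alpha + gamma) / beta - (delta / beta) * prices[-1]
        prices.append(min(p_hat, linear_part))   # the ceiling bends the line down
    return prices


def effective_floor(alpha, beta, gamma, delta, p_hat):
    """Price that follows a period at the ceiling (the lower end of the cycle)."""
    return (alpha + gamma) / beta - (delta / beta) * p_hat


def kink(alpha, beta, gamma, delta, p_hat):
    """Value of P_t at which the linear segment reaches the ceiling."""
    return (alpha + gamma) / delta - (beta / delta) * p_hat


path = ceiling_path(10, 2, 2, 3, p_hat=3.0, p0=2.5, periods=20)
```

Starting from 2.5, the simulated price swings more widely for a few periods, touches the ceiling in period 6, and from then on alternates between 3 and the effective floor of 1.5.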


† Strictly speaking, we should also “bend” that part of the phase line lying to the right of the point
$\hat{P}$ on the horizontal axis. But it does no harm to leave it as it is, as long as the other end has already
been bent, because the transplotting of $P_{t+1}$ to the horizontal axis will carry the upper limit of $\hat{P}$
over to the $P_t$ axis automatically.




What is significant about this result is that, whereas in the case of a linear phase line a
uniformly oscillatory path can be produced if and only if the slope of the phase line is $-1$,
now after the introduction of nonlinearity the same analytical result can arise even when the
phase line has a slope other than $-1$. The economic implication of this is of considerable
import. If one observes a more or less uniform oscillation in the actual time path of a
variable and attempts to explain it by means of a linear model, one will be forced to rely on the
rather special (and implausible) model specification that the phase-line slope is exactly
$-1$. But if nonlinearity is introduced, in either the smooth or the nonsmooth variety, then a
host of more reasonable assumptions can be used, each of which can equally account for
the observed feature of uniform oscillation.











EXERCISE 17.6 


1. In difference-equation models, the variable t can only take integer values. Does this
imply that in the phase diagrams of Fig. 17.4 the variables $y_t$ and $y_{t+1}$ must be
considered as discrete variables?

2. As a phase line, use the left half of an inverse U-shaped curve, and let it intersect the
45° line at two points L (left) and R (right).

(a) Is this a case of multiple equilibria?

(b) If the initial value $y_0$ lies to the left of L, what kind of time path will be obtained?

(c) What if the initial value lies between L and R?

(d) What if the initial value lies to the right of R?

(e) What can you conclude about the dynamic stability of equilibrium at L and at R,
respectively?

3. As a phase line, use an inverse U-shaped curve. Let its upward-sloping segment
intersect the 45° line at point L, and let its downward-sloping segment intersect the 45° line
at point R. Answer the same five questions raised in Prob. 2. (Note: Your answer will
depend on the particular way the phase line is drawn; explore various possibilities.)

4. In Fig. 17.5, rescind the legal price ceiling and impose a minimum price $P_m$ instead.
(a) How will the phase line change?

(b) Will it be kinked? Nonlinear?
(c) Will there also develop a uniformly oscillatory movement in price?

5. With reference to (17.17') and Fig. 17.5, show that the constant k can be expressed as

$$k = \frac{\alpha + \gamma}{\delta} - \frac{\beta}{\delta}\hat{P}$$



















Higher-Order 
Difference Equations 


The economic models in Chap. 17 involve difference equations that relate $P_{t+1}$ and $P_t$ to
each other. As the P value in one period can uniquely determine the P value in the next, the
time path of P becomes fully determinate once an initial value $P_0$ is specified. It may
happen, however, that the value of an economic variable in period t (say, $y_t$) depends not only
on $y_{t-1}$ but also on $y_{t-2}$. Such a situation will give rise to a difference equation of the
second order.

Strictly speaking, a second-order difference equation is one that involves an expression
$\Delta^2 y_t$, called the second difference of $y_t$, but contains no differences of order higher than 2.
The symbol $\Delta^2$, the discrete-time counterpart of the symbol $d^2/dt^2$, is an instruction to
“take the second difference” as follows:

$$\begin{aligned}
\Delta^2 y_t = \Delta(\Delta y_t) &= \Delta(y_{t+1} - y_t) && [\text{by (17.1)}] \\
&= (y_{t+2} - y_{t+1}) - (y_{t+1} - y_t) && [\text{again by (17.1)}]^{\dagger} \\
&= y_{t+2} - 2y_{t+1} + y_t
\end{aligned}$$

Thus a second difference of $y_t$ is transformable into a sum of terms involving a two-period
time lag. Since expressions like $\Delta^2 y_t$ and $\Delta y_t$ are quite cumbersome to work with, we shall
simply redefine a second-order difference equation as one involving a two-period time lag
in the variable. Similarly, a third-order difference equation is one that involves a
three-period time lag, etc.

Let us first concentrate on the method of solving a second-order difference equation, 
leaving the generalization to higher-order equations in Section 18.4. To keep the scope of 
discussion manageable, we shall only deal with linear difference equations with constant 
coefficients in the present chapter. However, both the constant-term and variable-term vari- 
eties will be examined. 


† That is, we first move the subscripts in the $(y_{t+1} - y_t)$ expression forward by one period, to get
a new expression $(y_{t+2} - y_{t+1})$, and then we subtract from the latter the original expression. Note
that, since the resulting difference may be written as $\Delta y_{t+1} - \Delta y_t$, we may infer the following rule
of operation:

$$\Delta(y_{t+1} - y_t) = \Delta y_{t+1} - \Delta y_t$$

This is reminiscent of the rule applicable to the derivative of a sum or difference.




18.1 Second-Order Linear Difference Equations 
with Constant Coefficients and Constant Term 





Example 1 


Example 2 


A simple variety of second-order difference equations takes the form
$$y_{t+2} + a_1 y_{t+1} + a_2 y_t = c \tag{18.1}$$


You will recognize this equation to be linear, nonhomogeneous, and with constant
coefficients $(a_1, a_2)$ and constant term c.


Particular Solution 
As before, the solution of (18.1) may be expected to have two components: a particular
solution $y_p$ representing the intertemporal equilibrium level of y, and a complementary
function $y_c$ specifying, for every time period, the deviation from the equilibrium. The
particular solution, defined as any solution of the complete equation, can sometimes be
found simply by trying a solution of the form $y_t = k$. Substituting this constant value of y
into (18.1), we obtain

$$k + a_1 k + a_2 k = c \qquad \text{and} \qquad k = \frac{c}{1 + a_1 + a_2}$$

Thus, so long as $(1 + a_1 + a_2) \ne 0$, the particular integral is

$$y_p (= k) = \frac{c}{1 + a_1 + a_2} \qquad (\text{case of } a_1 + a_2 \ne -1) \tag{18.2}$$


Find the particular integral of $y_{t+2} - 3y_{t+1} + 4y_t = 6$. Here we have $a_1 = -3$, $a_2 = 4$, and
$c = 6$. Since $a_1 + a_2 \ne -1$, the particular solution can be obtained from (18.2) as follows:

$$y_p = \frac{6}{1 - 3 + 4} = 3$$


In case $a_1 + a_2 = -1$, then the trial solution $y_t = k$ breaks down, and we can try
$y_t = kt$ instead. Substituting the latter into (18.1) and bearing in mind that we now have
$y_{t+1} = k(t + 1)$ and $y_{t+2} = k(t + 2)$, we find that

$$k(t + 2) + a_1 k(t + 1) + a_2 kt = c$$

and

$$k = \frac{c}{(1 + a_1 + a_2)t + a_1 + 2} = \frac{c}{a_1 + 2} \qquad [\text{since } a_1 + a_2 = -1]$$

Thus we can write the particular solution as

$$y_p (= kt) = \frac{c}{a_1 + 2}\,t \qquad (\text{case of } a_1 + a_2 = -1;\ a_1 \ne -2) \tag{18.2'}$$


Find the particular solution of $y_{t+2} + y_{t+1} - 2y_t = 12$. Here, $a_1 = 1$, $a_2 = -2$, and $c = 12$.
Obviously, formula (18.2) is not applicable, but (18.2') is. Thus,

$$y_p = \frac{12}{1 + 2}\,t = 4t$$

This particular solution represents a moving equilibrium.




Example 3 


If $a_1 + a_2 = -1$, but at the same time $a_1 = -2$ (that is, if $a_1 = -2$ and $a_2 = 1$), then we
can adopt a trial solution of the form $y_t = kt^2$, which implies $y_{t+1} = k(t + 1)^2$, etc. As you
may verify, in this case the particular solution turns out to be

$$y_p = kt^2 = \frac{c}{2}\,t^2 \qquad (\text{case of } a_1 = -2;\ a_2 = 1) \tag{18.2''}$$

However, since this formula applies only to the unique case of the difference equation
$y_{t+2} - 2y_{t+1} + y_t = c$, its usefulness is rather limited.
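The three trial solutions can be gathered into one selection rule. The following is a sketch rather than a definitive routine: the function names are hypothetical, and exact rational arithmetic (`fractions.Fraction`) is an assumption made here so that the test $a_1 + a_2 = -1$ is not disturbed by rounding; coefficients should therefore be passed as integers, fractions, or strings such as `"1/3"`.

```python
from fractions import Fraction


def particular_solution(a1, a2, c):
    """Pick the trial form for y_{t+2} + a1*y_{t+1} + a2*y_t = c.

    Returns the formula label and a function t -> y_p(t).
    """
    a1, a2, c = Fraction(a1), Fraction(a2), Fraction(c)
    if a1 + a2 != -1:                 # constant trial solution y_t = k
        k = c / (1 + a1 + a2)
        return "(18.2)", lambda t: k
    if a1 != -2:                      # trial solution y_t = k*t
        k = c / (a1 + 2)
        return "(18.2')", lambda t: k * t
    k = c / 2                         # trial solution y_t = k*t^2 (a1 = -2, a2 = 1)
    return "(18.2'')", lambda t: k * t * t


def residual(y, a1, a2, c, t):
    """Left side minus right side of the difference equation at period t."""
    return y(t + 2) + a1 * y(t + 1) + a2 * y(t) - c
```

Calling `particular_solution(-3, 4, 6)` reproduces Example 1 (the constant 3), and `particular_solution(1, -2, 12)` reproduces Example 2 (the moving equilibrium 4t); `residual` should be zero in every period for whichever form is returned.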


Complementary Function 
To find the complementary function, we must concentrate on the reduced equation

$$y_{t+2} + a_1 y_{t+1} + a_2 y_t = 0 \tag{18.3}$$

Our experience with first-order difference equations has taught us that the expression $Ab^t$
plays a prominent role in the general solution of such an equation. Let us therefore try a
solution of the form $y_t = Ab^t$, which naturally implies that $y_{t+1} = Ab^{t+1}$ and so on. It is
our task now to determine the values of A and b.

Upon substitution of the trial solution into (18.3), the equation becomes

$$Ab^{t+2} + a_1 Ab^{t+1} + a_2 Ab^t = 0$$

or, after canceling the (nonzero) common factor $Ab^t$,

$$b^2 + a_1 b + a_2 = 0 \tag{18.3'}$$

This quadratic equation, the characteristic equation of (18.3) or of (18.1), which is
comparable to (16.4"), possesses the two characteristic roots

$$b_1, b_2 = \frac{-a_1 \pm \sqrt{a_1^2 - 4a_2}}{2} \tag{18.4}$$

each of which is acceptable in the solution $Ab^t$. In fact, both $b_1$ and $b_2$ should appear in the
general solution of the homogeneous difference equation (18.3) because, just as in the case
of differential equations, this general solution must consist of two linearly independent
parts, each with its own multiplicative arbitrary constant.

Three possible situations may be encountered in regard to the characteristic roots,
depending on the square-root expression in (18.4). You will find these parallel very closely
the analysis of second-order differential equations in Sec. 16.1.


Case 1 (distinct real roots) When $a_1^2 > 4a_2$, the square root in (18.4) is a real number,
and $b_1$ and $b_2$ are real and distinct. In that event, $b_1^t$ and $b_2^t$ are linearly independent, and the
complementary function can simply be written as a linear combination of these
expressions; that is,

$$y_c = A_1 b_1^t + A_2 b_2^t \tag{18.5}$$

You should compare this with (16.7).


Find the solution of $y_{t+2} + y_{t+1} - 2y_t = 12$. This equation has the coefficients $a_1 = 1$ and
$a_2 = -2$; from (18.4), the characteristic roots can be found to be $b_1, b_2 = 1, -2$. Thus, the
complementary function is

$$y_c = A_1(1)^t + A_2(-2)^t = A_1 + A_2(-2)^t$$


Example 4 




Since, in Example 2, the particular solution of the given difference equation has already
been found to be $y_p = 4t$, we can write the general solution as

$$y_t = y_c + y_p = A_1 + A_2(-2)^t + 4t$$

There are still two arbitrary constants $A_1$ and $A_2$ to be definitized; to accomplish this, two
initial conditions are necessary. Suppose that we are given $y_0 = 4$ and $y_1 = 5$. Then, since
by letting $t = 0$ and $t = 1$ successively in the general solution we find

$$y_0 = A_1 + A_2 \qquad (= 4 \text{ by the first initial condition})$$
$$y_1 = A_1 - 2A_2 + 4 \qquad (= 5 \text{ by the second initial condition})$$

the arbitrary constants can be definitized to $A_1 = 3$ and $A_2 = 1$. The definite solution then
can finally be written as

$$y_t = 3 + (-2)^t + 4t$$
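One way to check a definite solution of this kind is to generate the sequence directly from the difference equation and compare it with the formula. The sketch below does so for the equation of this example; the function names are hypothetical, and the root finder is deliberately limited to the distinct-real-root case.

```python
import math


def characteristic_roots(a1, a2):
    """Roots of b^2 + a1*b + a2 = 0 when a1^2 > 4*a2."""
    discriminant = a1 ** 2 - 4 * a2
    if discriminant <= 0:
        raise ValueError("this sketch covers only the distinct-real-root case")
    root = math.sqrt(discriminant)
    return (-a1 + root) / 2, (-a1 - root) / 2


def iterate_recurrence(a1, a2, c, y0, y1, last_t):
    """Generate y_0 ... y_last_t from y_{t+2} = c - a1*y_{t+1} - a2*y_t."""
    ys = [y0, y1]
    while len(ys) <= last_t:
        ys.append(c - a1 * ys[-1] - a2 * ys[-2])
    return ys


def definite_solution(t):
    """The definite solution y_t = 3 + (-2)^t + 4t found in the example."""
    return 3 + (-2) ** t + 4 * t


roots = characteristic_roots(1, -2)
by_iteration = iterate_recurrence(1, -2, 12, y0=4, y1=5, last_t=8)
by_formula = [definite_solution(t) for t in range(9)]
```

The two lists agree term by term (4, 5, 15, 7, 35, ...), and the dominant root $-2$ accounts for the ever-larger swings around the moving equilibrium 4t.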


Case 2 (repeated real roots) When $a_1^2 = 4a_2$, the square root in (18.4) vanishes, and the
characteristic roots are repeated:

$$b (= b_1 = b_2) = -\frac{a_1}{2}$$

Now, if we express the complementary function in the form of (18.5), the two components
will collapse into a single term:

$$A_1 b_1^t + A_2 b_2^t = (A_1 + A_2)b^t \equiv A_3 b^t$$

This will not do, because we are now short of one constant.

To supply the missing component (which, we recall, should be linearly independent of
the term $A_3 b^t$), the old trick of multiplying $b^t$ by the variable t will again work. The new
component term is therefore to take the form $A_4 t b^t$. That this is linearly independent of
$A_3 b^t$ should be obvious, for we can never obtain the expression $A_4 t b^t$ by attaching a
constant coefficient to $A_3 b^t$. That $A_4 t b^t$ does indeed qualify as a solution of the homogeneous
equation (18.3), just as $A_3 b^t$ does, can easily be verified by substituting $y_t = A_4 t b^t$ [and
$y_{t+1} = A_4(t + 1)b^{t+1}$, etc.] into (18.3) and seeing that the latter will reduce to an identity
$0 = 0$.†

The complementary function for the repeated-root case is therefore

$$y_c = A_3 b^t + A_4 t b^t \tag{18.6}$$

which you should compare with (16.9).


Example 4

Find the complementary function of $y_{t+2} + 6y_{t+1} + 9y_t = 4$. The coefficients being $a_1 = 6$
and $a_2 = 9$, the characteristic roots are found to be $b_1 = b_2 = -3$. We therefore have

\[ y_c = A_3(-3)^t + A_4 t(-3)^t \]

If we proceed a step further, we can easily find $y_p = \frac{1}{4}$, so the general solution of the
given difference equation is

\[ y_t = A_3(-3)^t + A_4 t(-3)^t + \frac{1}{4} \]

Given two initial conditions, $A_3$ and $A_4$ can again be assigned definite values.
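
As a quick numerical cross-check of Example 4, the sketch below substitutes the general solution back into the difference equation using exact rational arithmetic. The function names and the particular constants chosen for $A_3$ and $A_4$ are illustrative assumptions, not part of the text.

```python
from fractions import Fraction


def repeated_root_solution(t, A3, A4, b=Fraction(-3), y_p=Fraction(1, 4)):
    """General solution y_t = A3*b^t + A4*t*b^t + y_p for the repeated-root case."""
    return A3 * b**t + A4 * t * b**t + y_p


def satisfies_equation(A3, A4, periods=10):
    """Check y_{t+2} + 6*y_{t+1} + 9*y_t = 4 for t = 0, ..., periods - 1."""
    def y(t):
        return repeated_root_solution(t, A3, A4)

    return all(y(t + 2) + 6 * y(t + 1) + 9 * y(t) == 4 for t in range(periods))


# Arbitrary illustrative constants (hypothetical, not taken from the text).
print(satisfies_equation(Fraction(2), Fraction(-5)))
```

Because the check holds for any choice of the two constants, both $A_3(-3)^t$ and $A_4 t(-3)^t$ are seen to be legitimate components of the complementary function.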


† In this substitution it should be kept in mind that we have in the present case $a_1^2 = 4a_2$ and
$b = -a_1/2$.


Case 3 (complex roots)  Under the remaining possibility of $a_1^2 < 4a_2$, the characteristic
roots are conjugate complex. Specifically, they will be in the form

\[ b_1, b_2 = h \pm vi \]

where

\[ h = -\frac{a_1}{2} \qquad \text{and} \qquad v = \frac{\sqrt{4a_2 - a_1^2}}{2} \]   (18.7)

The complementary function itself thus becomes

\[ y_c = A_1 b_1^t + A_2 b_2^t = A_1(h + vi)^t + A_2(h - vi)^t \]


As it stands, $y_c$ is not easily interpreted. But fortunately, thanks to De Moivre's theorem,
given in (16.23'), this complementary function can easily be transformed into trigonometric
terms, which we have learned to interpret.

According to the said theorem, we can write

\[ (h \pm vi)^t = R^t(\cos\theta t \pm i\sin\theta t) \]

where the value of $R$ (always taken to be positive) is, by (16.10),

\[ R = \sqrt{h^2 + v^2} = \sqrt{\frac{a_1^2 + 4a_2 - a_1^2}{4}} = \sqrt{a_2} \]   (18.8)

and $\theta$ is the radian measure of the angle in the interval $[0, 2\pi)$, which satisfies the
conditions

\[ \cos\theta = \frac{h}{R} = \frac{-a_1}{2\sqrt{a_2}} \qquad \text{and} \qquad \sin\theta = \frac{v}{R} = \sqrt{1 - \frac{a_1^2}{4a_2}} \]   (18.9)


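Therefore, the complementary function can be transformed as follows:

\[ y_c = A_1 R^t(\cos\theta t + i\sin\theta t) + A_2 R^t(\cos\theta t - i\sin\theta t) \]
\[ \quad = R^t[(A_1 + A_2)\cos\theta t + (A_1 - A_2)i\sin\theta t] \]
\[ \quad = R^t(A_5\cos\theta t + A_6\sin\theta t) \]   (18.10)

where we have adopted the shorthand symbols

\[ A_5 \equiv A_1 + A_2 \qquad \text{and} \qquad A_6 \equiv (A_1 - A_2)i \]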


The complementary function (18.10) differs from its differential-equation counterpart
(16.24') in two important respects. First, the expressions $\cos\theta t$ and $\sin\theta t$ have replaced the
previously used $\cos vt$ and $\sin vt$. Second, the multiplicative factor $R^t$ (an exponential with
base $R$) has replaced the natural exponential expression $e^{ht}$. In short, we have switched
from the Cartesian coordinates ($h$ and $v$) of the complex roots to their polar coordinates
($R$ and $\theta$). The values of $R$ and $\theta$ can be determined from (18.8) and (18.9) once $h$ and $v$
become known. It is also possible to calculate $R$ and $\theta$ directly from the parameter values
$a_1$ and $a_2$ via (18.8) and (18.9), provided we first make certain that $a_1^2 < 4a_2$ and that the
roots are indeed complex.
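
The passage from Cartesian to polar form can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `polar_form` and `root_h_plus_vi` are hypothetical, and the sample coefficients in the usage lines are chosen for convenience.

```python
import cmath
import math


def polar_form(a1, a2):
    """Return (R, theta) from (18.8) and (18.9); requires a1^2 < 4*a2."""
    if a1 * a1 >= 4 * a2:
        raise ValueError("roots are not complex: a1^2 >= 4*a2")
    R = math.sqrt(a2)                                  # (18.8)
    cos_theta = -a1 / (2 * math.sqrt(a2))              # (18.9)
    sin_theta = math.sqrt(1 - a1 * a1 / (4 * a2))      # (18.9)
    theta = math.atan2(sin_theta, cos_theta) % (2 * math.pi)  # angle in [0, 2*pi)
    return R, theta


def root_h_plus_vi(a1, a2):
    """The root h + vi from (18.7), as a Python complex number."""
    h = -a1 / 2
    v = math.sqrt(4 * a2 - a1 * a1) / 2
    return complex(h, v)


# Illustrative coefficients (an assumption for this sketch): a1 = -4, a2 = 16.
R, theta = polar_form(-4, 16)
b = root_h_plus_vi(-4, 16)
t = 5
# De Moivre: (h + vi)^t should agree with R^t (cos(theta*t) + i sin(theta*t)).
print(cmath.isclose(b**t, R**t * complex(math.cos(theta * t), math.sin(theta * t))))
```

The `atan2` call is one convenient way to pick the angle satisfying both conditions in (18.9) at once; a lookup in a table of sine and cosine values serves the same purpose by hand.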




Example 5

Find the general solution of $y_{t+2} + \frac{1}{4}y_t = 5$. With coefficients $a_1 = 0$ and $a_2 = \frac{1}{4}$, this
constitutes an illustration of the complex-root case of $a_1^2 < 4a_2$. By (18.7), the real and
imaginary parts of the roots are $h = 0$ and $v = \frac{1}{2}$. It follows from (18.8) that

\[ R = \sqrt{0 + \left(\tfrac{1}{2}\right)^2} = \frac{1}{2} \]

Since the value of $\theta$ is that which can satisfy the two equations

\[ \cos\theta = \frac{h}{R} = 0 \qquad \text{and} \qquad \sin\theta = \frac{v}{R} = 1 \]

it may be concluded from Table 16.1 that

\[ \theta = \frac{\pi}{2} \]

Consequently, the complementary function is

\[ y_c = \left(\tfrac{1}{2}\right)^t \left(A_5\cos\tfrac{\pi}{2}t + A_6\sin\tfrac{\pi}{2}t\right) \]

To find $y_p$, let us try a constant solution $y_t = k$ in the complete equation. This yields
$k = 4$; thus, $y_p = 4$, and the general solution can be written as

\[ y_t = \left(\tfrac{1}{2}\right)^t \left(A_5\cos\tfrac{\pi}{2}t + A_6\sin\tfrac{\pi}{2}t\right) + 4 \]   (18.11)


Example 6

Find the general solution of $y_{t+2} - 4y_{t+1} + 16y_t = 0$. In the first place, the particular
solution is easily found to be $y_p = 0$. This means that the general solution $y_t\ (= y_c + y_p)$ will be
identical with $y_c$. To find the latter, we note that the coefficients $a_1 = -4$ and $a_2 = 16$ do
produce complex roots. Thus we may substitute the $a_1$ and $a_2$ values directly into (18.8)
and (18.9) to obtain

\[ R = \sqrt{16} = 4 \]
\[ \cos\theta = \frac{4}{2 \cdot 4} = \frac{1}{2} \qquad \text{and} \qquad \sin\theta = \sqrt{1 - \frac{16}{4 \cdot 16}} = \frac{\sqrt{3}}{2} \]

The last two equations enable us to find from Table 16.2 that

\[ \theta = \frac{\pi}{3} \]

It follows that the complementary function (which also serves as the general solution
here) is

\[ y_t\ (= y_c) = 4^t \left(A_5\cos\tfrac{\pi}{3}t + A_6\sin\tfrac{\pi}{3}t\right) \]   (18.12)


The Convergence of the Time Path 

As in the case of first-order difference equations, the convergence of the time path $y_t$ hinges
solely on whether $y_c$ tends toward zero as $t \to \infty$. What we learned about the various
configurations of the expression $b^t$, in Fig. 17.1, is therefore still applicable, although in the
present context we shall have to consider two characteristic roots rather than one.

Consider first the case of distinct real roots: $b_1 \neq b_2$. If $|b_1| > 1$ and $|b_2| > 1$, then
both component terms in the complementary function (18.5), $A_1 b_1^t$ and $A_2 b_2^t$, will be
explosive, and thus $y_c$ must be divergent. In the opposite case of $|b_1| < 1$ and $|b_2| < 1$, both
terms in $y_c$ will converge toward zero as $t$ is indefinitely increased, as will $y_c$ also. What if
$|b_1| > 1$ but $|b_2| < 1$? In this intermediate case, it is evident that the $A_2 b_2^t$ term tends to
"die down," while the other term tends to deviate farther from zero. It follows that the $A_1 b_1^t$
term must eventually dominate the scene and render the path divergent.

Let us call the root with the higher absolute value the dominant root. Then it appears that
it is the dominant root $b_1$ which really sets the tone of the time path, at least with regard to its
ultimate convergence or divergence. Such is indeed the case. We may state, thus, that a time
path will be convergent (whatever the initial conditions may be) if and only if the dominant
root is less than 1 in absolute value. You can verify that this statement is valid for the cases
where both roots are greater than or less than 1 in absolute value (discussed previously), and
where one root has an absolute value of 1 exactly (not discussed previously). Note, however,
that even though the eventual convergence depends on the dominant root alone, the
nondominant root will exert a definite influence on the time path, too, at least in the beginning
periods. Therefore, the exact configuration of $y_t$ is still dependent on both roots.
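
The dominant-root criterion lends itself to a short computational sketch. The helper names below are hypothetical, and the quadratic is written as $b^2 + a_1 b + a_2 = 0$ on the assumption that this is the characteristic equation of the second-order equation under study; `cmath.sqrt` is used so that the same code does not fail when the discriminant is negative.

```python
import cmath


def characteristic_roots(a1, a2):
    """Roots of b^2 + a1*b + a2 = 0, returned as complex numbers."""
    disc = cmath.sqrt(a1 * a1 - 4 * a2)
    return (-a1 + disc) / 2, (-a1 - disc) / 2


def is_convergent(a1, a2):
    """True if the dominant root (largest absolute value) is below 1 in absolute value."""
    dominant = max(characteristic_roots(a1, a2), key=abs)
    return abs(dominant) < 1


# Roots 1 and -2 correspond to a1 = 1, a2 = -2 (sum of roots = -a1, product = a2).
print(is_convergent(1, -2))
# Hypothetical coefficients giving the roots 0.3 and 0.2.
print(is_convergent(-0.5, 0.06))
```

Only the dominant root enters the test, which mirrors the statement that the nondominant root affects the shape of the path but not its eventual convergence.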

Turning to the repeated-root case, we find the complementary function to consist of the
terms $A_3 b^t$ and $A_4 t b^t$, as shown in (18.6). The former is already familiar to us, but a word
of explanation is still needed for the latter, which involves a multiplicative $t$. If $|b| > 1$, the
$b^t$ term will be explosive, and the multiplicative $t$ will simply serve to intensify the
explosiveness as $t$ increases. If $|b| < 1$, on the other hand, the $b^t$ part (which tends to zero as $t$
increases) and the $t$ part will run counter to each other; i.e., the value of $t$ will offset rather
than reinforce $b^t$. Which force will prove the stronger? The answer is that the damping
force of $b^t$ will always win over the exploding force of $t$. For this reason, the basic
requirement for convergence in the repeated-root case is still that the root be less than 1 in absolute
value.
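
One way to see why the damping force wins, offered here as a supplementary sketch rather than as the argument of the text, is to compare successive values of the term $t b^t$:

\[ \left| \frac{(t+1)\,b^{t+1}}{t\,b^{t}} \right| = |b|\left(1 + \frac{1}{t}\right) \longrightarrow |b| \qquad \text{as } t \to \infty \]

If $|b| < 1$, this ratio eventually falls below any fixed number $q$ with $|b| < q < 1$, so from that period on $|t b^t|$ shrinks at least as fast as a geometric sequence with ratio $q$, and therefore tends to zero. With the illustrative value $b = 0.9$, for instance, the ratio $0.9(1 + 1/t)$ is below 1 once $t > 9$, so $t(0.9)^t$ rises at first but declines steadily thereafter.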





Example 7

Analyze the convergence of the solutions in Examples 3 and 4. For Example 3, the solution is

\[ y_t = 3 + (-2)^t + 4t \]

where the roots are 1 and $-2$, respectively [$3(1)^t = 3$], and where there is a moving
equilibrium $4t$. The dominant root being $-2$, the time path is divergent.
For Example 4, where the solution is

\[ y_t = A_3(-3)^t + A_4 t(-3)^t + \frac{1}{4} \]

and where $|b| = 3$, we also have divergence.


Let us now consider the complex-root case. From the general form of the complementary
function in (18.10),

\[ y_c = R^t(A_5\cos\theta t + A_6\sin\theta t) \]

it is clear that the parenthetical expression, like the one in (16.24'), will produce a fluctuating
pattern of a periodic nature. However, since the variable $t$ can only take integer values
0, 1, 2, ... in the present context, we shall catch and utilize only a subset of the points on
the graph of a circular function. The $y$ value at each such point will always prevail for a
whole period, till the next relevant point is reached. As illustrated in Fig. 18.1, the resulting
path is neither the usual oscillatory type (not alternating between values above and below
$y_p$ in consecutive periods), nor the usual fluctuating type (not smooth); rather, it displays a
sort of stepped fluctuation. As far as convergence is concerned, though, the decisive factor
is really the $R^t$ term, which, like the $e^{ht}$ term in (16.24'), will dictate whether the stepped
fluctuation is to be intensified or mitigated as $t$ increases. In the present case, the fluctuation
can be gradually narrowed down if and only if $R < 1$. Since $R$ is by definition the
absolute value of the conjugate complex roots ($h \pm vi$), the condition for convergence is
again that the characteristic roots be less than unity in absolute value.

FIGURE 18.1


To summarize: For all three cases of characteristic roots, the time path will converge to
a (stationary or moving) intertemporal equilibrium, regardless of what the initial conditions
may happen to be, if and only if the absolute value of every root is less than 1.


Example 8

Are the time paths (18.11) and (18.12) convergent? In (18.11) we have $R = \frac{1}{2}$; therefore
the time path will converge to the stationary equilibrium (= 4). In (18.12), on the other
hand, we have $R = 4$, so the time path will not converge to the equilibrium (= 0).
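
The two verdicts can also be checked by brute-force iteration. In the sketch below, the difference equations are taken to be the ones that lead to (18.11) and (18.12), namely $y_{t+2} + \frac{1}{4}y_t = 5$ and $y_{t+2} - 4y_{t+1} + 16y_t = 0$; the helper name `iterate` and the initial values are arbitrary assumptions made for illustration.

```python
def iterate(a1, a2, c, y0, y1, periods):
    """Generate y_0, ..., y_periods from y_{t+2} + a1*y_{t+1} + a2*y_t = c."""
    path = [y0, y1]
    for _ in range(periods - 1):
        path.append(c - a1 * path[-1] - a2 * path[-2])
    return path


# Initial values are hypothetical; any other pair would serve equally well.
path_11 = iterate(0, 0.25, 5, y0=6.0, y1=3.0, periods=40)   # equation behind (18.11)
path_12 = iterate(-4, 16, 0, y0=1.0, y1=1.0, periods=40)    # equation behind (18.12)

print(abs(path_11[-1] - 4))   # distance from the stationary equilibrium 4
print(abs(path_12[-1]))       # distance from the equilibrium 0
```

The first distance shrinks toward zero as the number of periods grows, while the second grows without bound, in line with $R = \frac{1}{2}$ and $R = 4$, respectively.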








EXERCISE 18.1 


1. Write out the characteristic equation for each of the following, and find the
   characteristic roots:
   (a) $y_{t+2} - y_{t+1} + \frac{1}{2}y_t = 2$          (c) $y_{t+2} + \frac{1}{2}y_{t+1} - \frac{1}{2}y_t = 5$
   (b) $y_{t+2} - 4y_{t+1} + 4y_t = 7$                    (d) $y_{t+2} - 2y_{t+1} + 3y_t = 4$

2. For each of the difference equations in Prob. 1 state on the basis of its characteristic
   roots whether the time path involves oscillation or stepped fluctuation, and whether it
   is explosive.

3. Find the particular solutions of the equations in Prob. 1. Do these represent stationary
   or moving equilibria?

4. Solve the following difference equations:
   (a) $y_{t+2} + 3y_{t+1} - \frac{7}{4}y_t = 9$    ($y_0 = 6$; $y_1 = 3$)
   (b) $y_{t+2} - 2y_{t+1} + 2y_t = 1$              ($y_0 = 3$; $y_1 = 4$)
   (c) $y_{t+2} - y_{t+1} + \frac{1}{4}y_t = 2$     ($y_0 = 4$; $y_1 = 7$)

5. Analyze the time paths obtained in Prob. 4.




18.2 Samuelson Multiplier-Acceleration Interaction Model 





As an illustration of the use of second-order difference equations in economics, let us cite a
classic work of Professor Paul Samuelson, the first American economist to win the Nobel Prize. We
refer to his classic interaction model, which seeks to explore the dynamic process of income
determination when the acceleration principle is in operation along with the Keynesian
multiplier.† Among other things, that model serves to demonstrate that the mere interaction of the
multiplier and the accelerator is capable of generating cyclical fluctuations endogenously.


The Framework 

Suppose that national income $Y_t$ is made up of three component expenditure streams:
consumption $C_t$, investment $I_t$, and government expenditure $G_t$. Consumption is envisaged as
a function not of current income but of the income of the prior period, $Y_{t-1}$; for simplicity,
it is assumed that $C_t$ is strictly proportional to $Y_{t-1}$. Investment, which is of the "induced"
variety, is a function of the prevailing trend of consumer spending. It is through this induced
investment, of course, that the acceleration principle enters into the model. Specifically, we
shall assume $I_t$ to bear a fixed ratio to the consumption increment $\Delta C_{t-1} \equiv C_t - C_{t-1}$. The
third component, $G_t$, on the other hand, is taken to be exogenous; in fact, we shall assume
it to be a constant and simply denote it by $G_0$.

These assumptions can be translated into the following set of equations:

\[ Y_t = C_t + I_t + G_0 \]
\[ C_t = \gamma Y_{t-1} \qquad (0 < \gamma < 1) \]   (18.13)
\[ I_t = \alpha(C_t - C_{t-1}) \qquad (\alpha > 0) \]

where $\gamma$ (the Greek letter gamma) represents the marginal propensity to consume, and $\alpha$
stands for the accelerator (short for acceleration coefficient). Note that, if induced
investment is expunged from the model, we are left with a first-order difference equation which
embodies the dynamic multiplier process (cf. Example 2 of Sec. 17.2). With induced
investment included, however, we have a second-order difference equation that depicts the
interaction of the multiplier and the accelerator.
By virtue of the second equation, we can express $I_t$ in terms of income as follows:

\[ I_t = \alpha(\gamma Y_{t-1} - \gamma Y_{t-2}) = \alpha\gamma(Y_{t-1} - Y_{t-2}) \]

Upon substituting this and the $C_t$ equation into the first equation in (18.13) and
rearranging, the model can be condensed into the single equation

\[ Y_t - \gamma(1 + \alpha)Y_{t-1} + \alpha\gamma Y_{t-2} = G_0 \]

or, equivalently (after shifting the subscripts forward by two periods),

\[ Y_{t+2} - \gamma(1 + \alpha)Y_{t+1} + \alpha\gamma Y_t = G_0 \]   (18.14)

Because this is a second-order linear difference equation with constant coefficients and
constant term, it can be solved by the method just learned.
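
Before solving the model analytically, it may help to see it run. The following sketch iterates the three structural equations in (18.13) period by period and then checks that the resulting income path also satisfies the condensed equation (18.14). The function names and all numerical values in the usage lines are assumptions made for illustration only.

```python
def samuelson_path(alpha, gamma, G0, Y0, Y1, periods):
    """Iterate (18.13): C_t = gamma*Y_{t-1}, I_t = alpha*(C_t - C_{t-1}), Y_t = C_t + I_t + G0."""
    Y = [Y0, Y1]
    for t in range(2, periods + 1):
        C_t = gamma * Y[t - 1]           # consumption from last period's income
        C_prev = gamma * Y[t - 2]        # consumption one period earlier
        I_t = alpha * (C_t - C_prev)     # induced investment (accelerator)
        Y.append(C_t + I_t + G0)
    return Y


def condensed_residuals(Y, alpha, gamma, G0):
    """Left side minus right side of (18.14) at each t; about zero along a model path."""
    return [Y[t + 2] - gamma * (1 + alpha) * Y[t + 1] + alpha * gamma * Y[t] - G0
            for t in range(len(Y) - 2)]


# Hypothetical parameter values and initial incomes.
Y = samuelson_path(alpha=0.8, gamma=0.7, G0=10.0, Y0=20.0, Y1=25.0, periods=30)
print(max(abs(r) for r in condensed_residuals(Y, 0.8, 0.7, 10.0)))
```

The residuals differ from zero only by floating-point rounding, which confirms that (18.14) carries the same information as the three equations of (18.13).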


† Paul A. Samuelson, "Interactions between the Multiplier Analysis and the Principle of Acceleration,"
Review of Economic Statistics, May 1939, pp. 75-78; reprinted in American Economic Association,
Readings in Business Cycle Theory, Richard D. Irwin, Inc., Homewood, Ill., 1944, pp. 261-269.


FIGURE 18.2 




The Solution 
As the particular solution, we have, by (18.2),

\[ Y_p = \frac{G_0}{1 - \gamma(1 + \alpha) + \alpha\gamma} = \frac{G_0}{1 - \gamma} \]

It may be noted that the expression $1/(1 - \gamma)$ is merely the multiplier that would prevail in
the absence of induced investment. Thus $G_0/(1 - \gamma)$, the exogenous expenditure item
times the multiplier, should give us the equilibrium income $Y^*$ in the sense that this
income level satisfies the equilibrium condition "national income = total expenditure" [cf.
(3.24)]. Being the particular solution of the model, however, it also gives us the
intertemporal equilibrium income $\bar{Y}$.

With regard to the complementary function, there are three possible cases. Case 1
($a_1^2 > 4a_2$), in the present context, is characterized by

\[ \gamma^2(1 + \alpha)^2 > 4\alpha\gamma \qquad \text{or} \qquad \gamma(1 + \alpha)^2 > 4\alpha \]

or

\[ \gamma > \frac{4\alpha}{(1 + \alpha)^2} \]

Similarly, to characterize Cases 2 and 3, we only need to change the > sign in the last
inequality to = and <, respectively. In Fig. 18.2, we have drawn the graph of the equation
$\gamma = 4\alpha/(1 + \alpha)^2$. According to the preceding discussion, the $(\alpha, \gamma)$ pairs that are located
exactly on this curve pertain to Case 2. On the other hand, the $(\alpha, \gamma)$ pairs lying above this
curve (involving higher $\gamma$ values) have to do with Case 1, and those lying below the curve
with Case 3.

This tripartite classification, with its graphical representation in Fig. 18.2, is of interest
because it reveals clearly the conditions under which cyclical fluctuations can emerge
endogenously from the interaction of the multiplier and the accelerator. But this tells
nothing about the convergence or divergence of the time path of $Y$. It remains, therefore, for us
to distinguish, under each case, between the damped and the explosive subcases. We could,
of course, take the easy way out by simply illustrating such subcases by citing specific
numerical examples. But let us attempt the more rewarding, if also more arduous, task of
delineating the general conditions under which convergence and divergence will prevail.

[Fig. 18.2 plots $\gamma$ (marginal propensity to consume) on the vertical axis against $\alpha$
(accelerator) on the horizontal axis. Legend: 1C Stable, no cycles; 1D Unstable, no cycles;
2C Stable, no cycles; 2D Unstable, no cycles; 3C Damped stepped fluctuation;
3D Explosive stepped fluctuation; 3D Uniform stepped fluctuation.]


Convergence versus Divergence 
The difference equation (18.14) has the characteristic equation

\[ b^2 - \gamma(1 + \alpha)b + \alpha\gamma = 0 \]

which yields the two roots

\[ b_1, b_2 = \frac{\gamma(1 + \alpha) \pm \sqrt{\gamma^2(1 + \alpha)^2 - 4\alpha\gamma}}{2} \]

Since the question of convergence versus divergence depends on the values of $b_1$ and $b_2$,
and since $b_1$ and $b_2$, in turn, depend on the values of the parameters $\alpha$ and $\gamma$, the conditions
for convergence and divergence should be expressible in terms of the values of $\alpha$ and $\gamma$.
To do this, we can make use of the fact that, by (16.6), the two characteristic roots are
always related to each other by the following two equations:

\[ b_1 + b_2 = \gamma(1 + \alpha) \]   (18.15)
\[ b_1 b_2 = \alpha\gamma \]   (18.15')

On the basis of these two equations, we may observe that

\[ (1 - b_1)(1 - b_2) = 1 - (b_1 + b_2) + b_1 b_2 \]
\[ \quad = 1 - \gamma(1 + \alpha) + \alpha\gamma = 1 - \gamma \]   (18.16)

In view of the model specification that $0 < \gamma < 1$, it becomes necessary to impose on the
two roots the condition

\[ 0 < (1 - b_1)(1 - b_2) < 1 \]   (18.17)


Let us now examine the question of convergence under Case 1, where the roots are real
and distinct. Since, by assumption, $\alpha$ and $\gamma$ are both positive, (18.15') tells us that
$b_1 b_2 > 0$, which implies that $b_1$ and $b_2$ possess the same algebraic sign. Furthermore, since
$\gamma(1 + \alpha) > 0$, (18.15) indicates that both $b_1$ and $b_2$ must be positive. Hence, the time path
$Y_t$ cannot have oscillations in Case 1.

Even though the signs of $b_1$ and $b_2$ are now known, there actually exist under Case 1 as
many as five possible combinations of $(b_1, b_2)$ values, each with its own implication
regarding the corresponding values for $\alpha$ and $\gamma$:

  (i)    $0 < b_2 < b_1 < 1 \quad\Rightarrow\quad 0 < \gamma < 1;\ \alpha\gamma < 1$
  (ii)   $0 < b_2 < b_1 = 1 \quad\Rightarrow\quad \gamma = 1$
  (iii)  $0 < b_2 < 1 < b_1 \quad\Rightarrow\quad \gamma > 1$
  (iv)   $1 = b_2 < b_1 \quad\Rightarrow\quad \gamma = 1$
  (v)    $1 < b_2 < b_1 \quad\Rightarrow\quad 0 < \gamma < 1;\ \alpha\gamma > 1$


TABLE 18.1 
Cases and 
Subcases of the 
Samuelson 
Model 




Possibility i, where both $b_1$ and $b_2$ are positive fractions, duly satisfies condition (18.17)
and hence conforms to the model specification $0 < \gamma < 1$. The product of the two roots
must also be a positive fraction under this possibility, and this, by (18.15'), implies that
$\alpha\gamma < 1$. In contrast, the next three possibilities all violate condition (18.17) and result in
inadmissible $\gamma$ values (see Exercise 18.2-3). Hence they must be ruled out. But Possibility
v may still be acceptable. With both $b_1$ and $b_2$ greater than one, (18.17) may still be
satisfied if $(1 - b_1)(1 - b_2) < 1$. But this time we have $\alpha\gamma > 1$ (rather than < 1) from
(18.15'). The upshot is that there are only two admissible subcases under Case 1. The
first, Possibility i, involves fractional roots $b_1$ and $b_2$, and therefore yields a convergent
time path of $Y$. The other subcase, Possibility v, features roots greater than one, and thus
produces a divergent time path. As far as the values of $\alpha$ and $\gamma$ are concerned, however, the
question of convergence and divergence only hinges on whether $\alpha\gamma < 1$ or $\alpha\gamma > 1$. This
information is summarized in the top part of Table 18.1, where the convergent subcase is
labeled 1C, and the divergent subcase 1D.

The analysis of Case 2, with repeated roots, is similar in nature. The roots are now
$b = \gamma(1 + \alpha)/2$, with a positive sign because $\alpha$ and $\gamma$ are positive. Thus, there is again no
oscillation. This time we may classify the value of $b$ into three possibilities only:

  (vi)    $0 < b < 1 \quad\Rightarrow\quad \gamma < 1;\ \alpha\gamma < 1$
  (vii)   $b = 1 \quad\Rightarrow\quad \gamma = 1$
  (viii)  $b > 1 \quad\Rightarrow\quad \gamma < 1;\ \alpha\gamma > 1$

Under Possibility vi, $b\ (= b_1 = b_2)$ is a positive fraction; thus the implications regarding $\alpha$
and $\gamma$ are entirely identical with those of Possibility i under Case 1. In an analogous manner,
Possibility viii, with $b\ (= b_1 = b_2)$ greater than one, can satisfy (18.17) only if $1 < b < 2$;
if so, it yields the same results as Possibility v. On the other hand, Possibility vii violates
(18.17) and must be ruled out. Thus there are again only two admissible subcases. The
first, Possibility vi, yields a convergent time path, whereas the other, Possibility viii,
gives a divergent one. In terms of $\alpha$ and $\gamma$, the convergent and divergent subcases are again
associated, respectively, with $\alpha\gamma < 1$ and $\alpha\gamma > 1$. These results are listed in the middle part
of Table 18.1, where the two subcases are labeled 2C (convergent) and 2D (divergent).














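Case                                   Subcase                    Values of $\alpha$ and $\gamma$   Time path $Y_t$

1. Distinct real roots                 1C: $0 < b_2 < b_1 < 1$    $\alpha\gamma < 1$                Nonoscillatory and
   $\gamma > 4\alpha/(1+\alpha)^2$     1D: $1 < b_2 < b_1$        $\alpha\gamma > 1$                nonfluctuating

2. Repeated real roots                 2C: $0 < b < 1$            $\alpha\gamma < 1$                Nonoscillatory and
   $\gamma = 4\alpha/(1+\alpha)^2$     2D: $b > 1$                $\alpha\gamma > 1$                nonfluctuating

3. Complex roots                       3C: $R < 1$                $\alpha\gamma < 1$                With stepped
   $\gamma < 4\alpha/(1+\alpha)^2$     3D: $R \ge 1$              $\alpha\gamma \ge 1$              fluctuation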







Finally, in Case 3, with complex roots, we have stepped fluctuation, and hence
endogenous business cycles. In this case, we should look to the absolute value $R = \sqrt{a_2}$
[see (18.8)] for the clue to convergence and divergence, where $a_2$ is the coefficient of the $y_t$
term in the difference equation (18.1). In the present model, we have $R = \sqrt{\alpha\gamma}$, which
gives rise to the following three possibilities:

  (ix)  $R < 1 \quad\Rightarrow\quad \alpha\gamma < 1$
  (x)   $R = 1 \quad\Rightarrow\quad \alpha\gamma = 1$
  (xi)  $R > 1 \quad\Rightarrow\quad \alpha\gamma > 1$

Even though all of these happen to be admissible (see Exercise 18.2-4), only the $R < 1$
possibility entails a convergent time path and qualifies as Subcase 3C in Table 18.1. The
other two are thus collectively labeled as Subcase 3D.

In sum, we may conclude from Table 18.1 that a convergent time path can occur if and
only if $\alpha\gamma < 1$.
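
The classification just completed can be condensed into a small decision routine. This is a sketch under stated assumptions: the function name `samuelson_subcase` is hypothetical, the labels are those of Table 18.1, and the exact-equality test for Case 2 is only reliable when the inputs are exactly representable (a tolerance would be needed for general floating-point values).

```python
def samuelson_subcase(alpha, gamma):
    """Return the Table 18.1 subcase label for an admissible (alpha, gamma) pair."""
    if not (alpha > 0 and 0 < gamma < 1):
        raise ValueError("model requires alpha > 0 and 0 < gamma < 1")
    boundary = 4 * alpha / (1 + alpha) ** 2   # gamma value separating the root cases
    if gamma > boundary:
        case = "1"    # distinct real roots
    elif gamma == boundary:
        case = "2"    # repeated real roots
    else:
        case = "3"    # complex roots
    # alpha*gamma equals the product of the roots by (18.15'); convergence iff it is below 1.
    # In Case 3 the borderline alpha*gamma = 1 also falls under the "D" label.
    return case + ("C" if alpha * gamma < 1 else "D")


# Illustrative parameter pairs (hypothetical values).
for pair in [(0.1, 0.9), (10, 0.9), (3, 0.75), (0.8, 0.7), (2, 0.5)]:
    print(pair, samuelson_subcase(*pair))
```

Note that the routine needs only two comparisons: $\gamma$ against $4\alpha/(1+\alpha)^2$ for the type of root, and $\alpha\gamma$ against 1 for convergence.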


A Graphical Summary 

The preceding analysis has resulted in a somewhat complex classification of cases and
subcases. It would help to have a visual representation of the classificatory scheme. This is
supplied in Fig. 18.2.

The set of all admissible $(\alpha, \gamma)$ pairs in the model is shown in Fig. 18.2 by the variously
shaded rectangular area. Since the values of $\gamma = 0$ and $\gamma = 1$ are excluded, as is the value
$\alpha = 0$, the shaded area is a sort of rectangle without sides. We have already graphed the
equation $\gamma = 4\alpha/(1 + \alpha)^2$ to mark off the three major cases of Table 18.1: The points on
that curve pertain to Case 2; the points lying to the north of the curve (representing higher $\gamma$
values) belong to Case 1; those lying to the south (with lower $\gamma$ values) are of Case 3. To
distinguish between the convergent and divergent subcases, we now add the graph of $\alpha\gamma = 1$
(a rectangular hyperbola) as another demarcation line. The points lying to the north of this
rectangular hyperbola satisfy the inequality $\alpha\gamma > 1$, whereas those located below it
correspond to $\alpha\gamma < 1$. It is then possible to mark off the subcases easily. Under Case 1, the
broken-line shaded region, being below the hyperbola, corresponds to Subcase 1C, but the solid-line
shaded region is associated with Subcase 1D. Under Case 2, which relates to the points lying
on the curve $\gamma = 4\alpha/(1 + \alpha)^2$, Subcase 2C covers the upward-sloping portion of that curve,
and Subcase 2D, the downward-sloping portion. Finally, for Case 3, the rectangular hyperbola
serves to separate the dot-shaded region (Subcase 3C) from the pebble-shaded region
(Subcase 3D). The latter, you should note, also includes the points located on the rectangular
hyperbola itself, because of the weak inequality in the specification $\alpha\gamma \ge 1$.

Since Fig. 18.2 is the repository of all the qualitative conclusions of the model, given
any ordered pair $(\alpha, \gamma)$, we can always find the correct subcase graphically by plotting the
ordered pair in the diagram.











Example 1

If the accelerator is 0.8 and the marginal propensity to consume is 0.7, what kind of
interaction time path will result? The ordered pair (0.8, 0.7) is located in the dot-shaded region,
Subcase 3C; thus the time path is characterized by damped stepped fluctuation.

Example 2

What kind of interaction is implied by $\alpha = 2$ and $\gamma = 0.5$? The ordered pair (2, 0.5) lies
exactly on the rectangular hyperbola, under Subcase 3D. The time path of $Y$ will again display
stepped fluctuation, but it will be neither explosive nor damped. By analogy to the cases of
uniform oscillation and uniform fluctuation, we may term this situation as "uniform stepped
fluctuation." However, the uniformity feature in this latter case cannot in general be expected
to be a perfect one, because, similarly to what was done in Fig. 18.1, we can only accept
those points on a sine or cosine curve that correspond to integer values of $t$, but these values
of $t$ may hit an entirely different set of points on the curve in each period of fluctuation.





EXERCISE 18.2 

1. By consulting Fig. 18.2, find the subcases to which the following sets of values of $\alpha$ and
   $\gamma$ pertain, and describe the interaction time path qualitatively.
   (a) $\alpha = 3.5$; $\gamma = 0.8$        (c) $\alpha = 0.2$; $\gamma = 0.9$
   (b) $\alpha = 2$; $\gamma = 0.7$          (d) $\alpha = 1.5$; $\gamma = 0.6$

2. From the values of $\alpha$ and $\gamma$ given in parts (a) and (c) of Prob. 1, find the numerical
   values of the characteristic roots in each instance, and analyze the nature of the time path.
   Do your results check with those obtained earlier?


4. Show that in Case 3 we can never encounter $\gamma \ge 1$.


18.3 Inflation and Unemployment in Discrete Time 





The interaction of inflation and unemployment, discussed earlier in the continuous-time
framework, can also be couched in discrete time. Using essentially the same economic
assumptions, we shall illustrate in this section how that model can be reformulated as a
difference-equation model.


The Model 
The earlier continuous-time formulation (Sec. 16.5) consisted of three differential
equations:

\[ p = \alpha - T - \beta U + g\pi \qquad \text{[expectations-augmented Phillips relation]} \]   (16.33)
\[ \frac{d\pi}{dt} = j(p - \pi) \qquad \text{[adaptive expectations]} \]   (16.34)
\[ \frac{dU}{dt} = -k(m - p) \qquad \text{[monetary policy]} \]   (16.35)

Three endogenous variables are present: $p$ (actual rate of inflation), $\pi$ (expected rate of
inflation), and $U$ (rate of unemployment). As many as six parameters appear in the model;
among these, the parameter $m$, the rate of growth of nominal money (or, the rate of
monetary expansion), differs from the others in that its magnitude is set as a policy decision.

When cast into the period-analysis mold, the Phillips relation (16.33) simply becomes

\[ p_t = \alpha - T - \beta U_t + g\pi_t \qquad (\alpha, \beta > 0;\ 0 < g \le 1) \]   (18.18)

In the adaptive-expectations equation, the derivative must be replaced by a difference
expression:

\[ \pi_{t+1} - \pi_t = j(p_t - \pi_t) \qquad (0 < j \le 1) \]   (18.19)

By the same token, the monetary-policy equation should be changed to†

\[ U_{t+1} - U_t = -k(m - p_{t+1}) \qquad (k > 0) \]   (18.20)

These three equations constitute the new version of the inflation-unemployment model.


The Difference Equation in p 

As the first step in the analysis of this new model, we again try to condense the model into
a single equation in a single variable. Let that variable be $p$. Accordingly, we shall focus
our attention on (18.18). However, since (18.18), unlike the other two equations, does
not by itself describe a pattern of change, it is up to us to create such a pattern. This is
accomplished by differencing $p_t$, i.e., by taking the first difference of $p_t$, according to the
definition

\[ \Delta p_t \equiv p_{t+1} - p_t \]

Two steps are involved in this. First, we shift the time subscripts in (18.18) forward one
period, to get

\[ p_{t+1} = \alpha - T - \beta U_{t+1} + g\pi_{t+1} \]   (18.18')

Then we subtract (18.18) from (18.18'), to obtain the first difference of $p_t$ that gives the
desired pattern of change:

\[ p_{t+1} - p_t = -\beta(U_{t+1} - U_t) + g(\pi_{t+1} - \pi_t) \]
\[ \quad = \beta k(m - p_{t+1}) + gj(p_t - \pi_t) \qquad \text{[by (18.20) and (18.19)]} \]   (18.21)

Note that, on the second line of (18.21), the patterns of change of the other two variables as
given in (18.19) and (18.20) have been incorporated into the pattern of change of the $p$
variable. Thus (18.21) now embodies all the information in the present model.

However, the $\pi_t$ term is extraneous to the study of $p$ and needs to be eliminated from
(18.21). To that end, we make use of the fact that

\[ g\pi_t = p_t - (\alpha - T) + \beta U_t \qquad \text{[by (18.18)]} \]   (18.22)

Substituting this into (18.21) and collecting terms, we obtain

\[ (1 + \beta k)p_{t+1} - [1 - j(1 - g)]p_t + j\beta U_t = \beta k m + j(\alpha - T) \]   (18.23)

But there now appears a $U_t$ term to be eliminated. To do that, we difference (18.23) to get
a $(U_{t+1} - U_t)$ term and then use (18.20) to eliminate the latter. Only after this rather
lengthy process of substitutions do we get the desired difference equation in the $p$ variable
alone, which, when duly normalized, takes the form

\[ p_{t+2} \underbrace{- \frac{1 + gj + (1 - j)(1 + \beta k)}{1 + \beta k}}_{a_1}\, p_{t+1} + \underbrace{\frac{1 - j(1 - g)}{1 + \beta k}}_{a_2}\, p_t = \underbrace{\frac{j\beta k m}{1 + \beta k}}_{c} \]   (18.24)
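
The differencing step that leads from (18.23) to (18.24) can be spelled out as follows; this is a sketch of the intermediate algebra, using only (18.20) and (18.23). Shifting (18.23) forward one period and subtracting (18.23) itself removes the constant term:

\[ (1 + \beta k)(p_{t+2} - p_{t+1}) - [1 - j(1 - g)](p_{t+1} - p_t) + j\beta(U_{t+1} - U_t) = 0 \]

By (18.20), the last term on the left is

\[ j\beta(U_{t+1} - U_t) = -j\beta k(m - p_{t+1}) = -j\beta k m + j\beta k\, p_{t+1} \]

Collecting the $p_{t+1}$ terms, whose combined coefficient is $-(1 + \beta k) - [1 - j(1 - g)] + j\beta k = -[1 + gj + (1 - j)(1 + \beta k)]$, we get

\[ (1 + \beta k)p_{t+2} - [1 + gj + (1 - j)(1 + \beta k)]p_{t+1} + [1 - j(1 - g)]p_t = j\beta k m \]

and division through by $(1 + \beta k)$ yields the normalized form.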





† We have assumed that the change in $U_t$ depends on $(m - p_{t+1})$, the rate of growth of real money
in period $(t + 1)$. As an alternative, it is possible to make it depend on the rate of growth of real
money in period $t$, $(m - p_t)$ (see Exercise 18.3-4).


The Time Path of p 


The intertemporal equilibrium value of $p$, given by the particular integral of (18.24), is

\[ \bar{p} = \frac{c}{1 + a_1 + a_2} = \frac{j\beta k m}{\beta k j} = m \qquad \text{[by (18.2)]} \]

As in the continuous-time model, therefore, the equilibrium rate of inflation is exactly
equal to the rate of monetary expansion.

As to the complementary function, there may arise either distinct real roots (Case 1), or
repeated real roots (Case 2), or complex roots (Case 3), depending on the relative
magnitudes of $a_1^2$ and $4a_2$. In the present model,

\[ a_1^2 \gtreqless 4a_2 \qquad \text{iff} \qquad [1 + gj + (1 - j)(1 + \beta k)]^2 \gtreqless 4[1 - j(1 - g)](1 + \beta k) \]   (18.25)

If $g = \frac{1}{2}$, $j = \frac{1}{3}$ and $\beta k = 5$, for instance, then the left side of the second inequality in
(18.25) is $(5\frac{1}{6})^2$ whereas the right side is 20; thus Case 1 results. But if $g = j = 1$, then the
left side is 4 while the right side is $4(1 + \beta k) > 4$, and we have Case 3
instead. In view of the larger number of parameters in the present model, however, it is not
feasible to construct a classificatory graph like Fig. 18.2 in the Samuelson model.
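
Although no classificatory graph is available, the case that applies to any given parameter set is easy to determine numerically. The sketch below computes $a_1$ and $a_2$ as they appear in (18.24) and compares $a_1^2$ with $4a_2$; the function names are hypothetical, and `Fraction` inputs are used in the usage lines so that the comparison is exact.

```python
from fractions import Fraction


def coefficients(g, j, beta_k):
    """Return (a1, a2) of the normalized difference equation (18.24)."""
    denom = 1 + beta_k
    a1 = -(1 + g * j + (1 - j) * (1 + beta_k)) / denom
    a2 = (1 - j * (1 - g)) / denom
    return a1, a2


def root_case(g, j, beta_k):
    """1 = distinct real roots, 2 = repeated real roots, 3 = complex roots."""
    a1, a2 = coefficients(g, j, beta_k)
    if a1 * a1 > 4 * a2:
        return 1
    if a1 * a1 == 4 * a2:
        return 2
    return 3


# The two parameter sets mentioned in the text, with beta*k = 5 assumed in the second as well.
print(root_case(Fraction(1, 2), Fraction(1, 3), Fraction(5)))
print(root_case(Fraction(1), Fraction(1), Fraction(5)))
```

Only the product $\beta k$ matters for the roots, which is why the routine takes it as a single argument rather than asking for $\beta$ and $k$ separately.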

Nevertheless, the analysis of convergence can still proceed along the same line as in
Sec. 18.2. Specifically, we recall from (16.6) that the two characteristic roots $b_1$ and $b_2$ must
satisfy the following two relations [see (18.24)]:

\[ b_1 + b_2 = -a_1 = \frac{1 + gj}{1 + \beta k} + 1 - j > 0 \]   (18.26)
\[ b_1 b_2 = a_2 = \frac{1 - j(1 - g)}{1 + \beta k} \in (0, 1) \]   (18.26')

Furthermore, we have in the present model

\[ (1 - b_1)(1 - b_2) = 1 - (b_1 + b_2) + b_1 b_2 = \frac{\beta j k}{1 + \beta k} > 0 \]   (18.27)


Now consider Case 1, where the two roots $b_1$ and $b_2$ are real and distinct. Since their
product $b_1 b_2$ is positive, $b_1$ and $b_2$ must take the same sign. Because their sum is positive,
moreover, $b_1$ and $b_2$ must both be positive, implying that no oscillation can occur.
From (18.27), we can infer that neither $b_1$ nor $b_2$ can be equal to one; for otherwise
$(1 - b_1)(1 - b_2)$ would be zero, in violation of the indicated inequality. This means that, in
terms of the various possibilities of $(b_1, b_2)$ combinations enumerated in the Samuelson
model, Possibilities ii and iv cannot arise here. It is also unacceptable to have one root
greater, and the other root less, than one; for otherwise $(1 - b_1)(1 - b_2)$ would be negative.
Thus Possibility iii is ruled out as well. It follows that $b_1$ and $b_2$ must be either both greater
than one, or both less than one. If $b_1 > 1$ and $b_2 > 1$ (Possibility v), however, (18.26')
would be violated. Hence the only viable eventuality is Possibility i, with $b_1$ and $b_2$ both
being positive fractions, so that the time path of $p$ is convergent.

The analysis of Case 2 is basically not much different. By practically identical
reasoning, we can conclude that the repeated root $b$ can only turn out to be a positive fraction in
this model; that is, Possibility vi is feasible, but not Possibilities vii and viii. The time path
of $p$ in Case 2 is again nonoscillatory and convergent.





For Case 3, convergence requires that $R$ (the absolute value of the complex roots) be less
than one. By (18.8), $R = \sqrt{a_2}$. Inasmuch as $a_2$ is a positive fraction [see (18.26')], we do
have $R < 1$. Thus the time path of $p$ in Case 3 is also convergent, although this time there
will be stepped fluctuation.
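
The convergence result can be illustrated by simulating the three-equation model (18.18) through (18.20) directly. The sketch below is only an illustration: the function name and every parameter value are assumptions. Since (18.20) involves $p_{t+1}$, which by (18.18) depends in turn on $U_{t+1}$, the two equations are solved jointly for $U_{t+1}$ at each step. If the path settles down, (18.19) requires $p = \pi$ and (18.20) requires $p = m$, so that (18.18) gives the resting value $U = [\alpha - T - (1 - g)m]/\beta$.

```python
def simulate(alpha, T, beta, g, j, k, m, U0, pi0, periods):
    """Iterate (18.18)-(18.20); return lists of p_t, pi_t, U_t for t = 0..periods."""
    U, pi = [U0], [pi0]
    p = [alpha - T - beta * U0 + g * pi0]                  # (18.18) at t = 0
    for _ in range(periods):
        pi_next = pi[-1] + j * (p[-1] - pi[-1])            # (18.19)
        # (18.20) with p_{t+1} replaced by (18.18) at t + 1, solved for U_{t+1}:
        U_next = (U[-1] - k * m + k * (alpha - T + g * pi_next)) / (1 + beta * k)
        p_next = alpha - T - beta * U_next + g * pi_next   # (18.18) at t + 1
        pi.append(pi_next)
        U.append(U_next)
        p.append(p_next)
    return p, pi, U


# Hypothetical parameter values and initial conditions.
params = dict(alpha=0.25, T=0.05, beta=2.0, g=0.5, j=0.4, k=0.5, m=0.05)
p, pi, U = simulate(U0=0.12, pi0=0.02, periods=200, **params)
print(abs(p[-1] - params["m"]))   # gap between inflation and monetary expansion
```

With these particular values the roots happen to be complex, so the simulated inflation rate approaches $m$ with stepped fluctuation rather than monotonically.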


The Analysis of U 

If we wish to analyze instead the time path of the rate of unemployment, we may take 
(18.20) as the point of departure. To get rid of the p term in that equation, we first substi- 
tute (18. 18") to get 


(1+ Bis —U, = Ma -T—m)+hym)) (18.28) 


Next, to prepare for the substitution of the other equation, (18.19), we difference (18.28) to 
find that 


(1 + Bk WUias — (2+ BU pr + Us = kelote2 — xi) (18.29) 


In view of the presence of a difference expression in on the right, we can substitute for it 
a forward-shifted version of the adaptive-expectations equation. The result of this, 


(1+ BRyUi2 — 2+ B41 + U, = kei (Pit = te) (18.30) 
is the embodiment of all the information in the modet. 
However, we must eliminate the $p$ and $\pi$ variables before a proper difference equation
in $U$ will emerge. For this purpose, we note from (18.20) that

$$ kp_{t+1} = U_{t+1} - U_t + km \tag{18.31} $$

Moreover, by multiplying (18.22) through by $(-kj)$ and shifting the time subscripts, we
can write

$$
\begin{aligned}
-kjg\pi_{t+1} &= -kjp_{t+1} + kj(\alpha - T) - \beta kjU_{t+1} \\
&= -j(U_{t+1} - U_t + km) + kj(\alpha - T) - \beta kjU_{t+1} \qquad [\text{by (18.31)}] \\
&= -j(1 + \beta k)U_{t+1} + jU_t + kj(\alpha - T - m)
\end{aligned}
\tag{18.32}
$$
These two results express $p_{t+1}$ and $\pi_{t+1}$ in terms of the $U$ variable and can thus enable us,
on substitution into (18.30), to obtain (at long last!) the desired difference equation in the
$U$ variable alone:

$$
U_{t+2} - \frac{1 + gj + (1 - j)(1 + \beta k)}{1 + \beta k}\,U_{t+1} + \frac{1 - j(1 - g)}{1 + \beta k}\,U_t
= \frac{kj[\alpha - T - (1 - g)m]}{1 + \beta k}
\tag{18.33}
$$


It is noteworthy that the two constant coefficients on the left ($a_1$ and $a_2$) are identical
with those in the difference equation for $p$ [i.e., (18.24)]. As a result, the earlier analysis of
the complementary function of the $p$ path should be equally applicable to the present
context. But the constant term on the right of (18.33) does differ from that of (18.24).
Consequently, the particular solutions in the two situations will be different. This is as it should
be, for, coincidence aside, there is no inherent reason to expect the intertemporal equilibrium
rate of unemployment to be the same as the equilibrium rate of inflation.
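Because the $p$ and $U$ equations share the coefficients $a_1$ and $a_2$, the convergence claim can be checked numerically in one pass. The following is only a sketch: the function names and the parameter grid are illustrative assumptions, and the restrictions $0 < g \le 1$, $0 < j \le 1$, $\beta > 0$, $k > 0$ are taken here as the model's parameter restrictions.

```python
import cmath

def model_coefficients(beta, g, j, k):
    """Coefficients a1 and a2 on the left side of (18.33)."""
    denom = 1 + beta * k
    a1 = -(1 + g * j + (1 - j) * denom) / denom
    a2 = (1 - j * (1 - g)) / denom
    return a1, a2

def char_roots(a1, a2):
    """Roots of b**2 + a1*b + a2 = 0; cmath handles the complex-root case."""
    disc = cmath.sqrt(a1 * a1 - 4 * a2)
    return (-a1 + disc) / 2, (-a1 - disc) / 2

# Illustrative grid of parameter values (assumed, not taken from the text).
fractions_grid = [0.25, 0.5, 0.75, 1.0]
largest_modulus = 0.0
for beta in (0.5, 1.0, 3.0):
    for k in (0.5, 1.0, 3.0):
        for g in fractions_grid:
            for j in fractions_grid:
                b1, b2 = char_roots(*model_coefficients(beta, g, j, k))
                largest_modulus = max(largest_modulus, abs(b1), abs(b2))

# True means every root on the grid lies inside the unit circle.
print(largest_modulus < 1)
```

A grid check of this kind illustrates the analytical result but does not replace it; the argument in the text covers all admissible parameter values.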




The Long-Run Phillips Relation 


It is readily verified that the intertemporal equilibrium rate of unemployment is

$$ \bar{U} = \frac{1}{\beta}[\alpha - T - (1 - g)m] $$

But since the equilibrium rate of inflation has been found to be $\bar{p} = m$, we can link $\bar{U}$ to $\bar{p}$
by the equation

$$ \bar{U} = \frac{1}{\beta}[\alpha - T - (1 - g)\bar{p}] \tag{18.34} $$

Because this equation is concerned only with the equilibrium rates of unemployment and
inflation, it is said to depict the long-run Phillips relation.

A special case of (18.34) has received a great deal of attention among economists: the
case of $g = 1$. If $g = 1$, the $\bar{p}$ term will have a zero coefficient and thus drop out of the
picture. In other words, $\bar{U}$ will become a constant function of $\bar{p}$. In the standard Phillips
diagram, where the rate of unemployment is plotted on the horizontal axis, this outcome gives
rise to a vertical long-run Phillips curve. The $\bar{U}$ value in this case, referred to as the natural
rate of unemployment, is then consistent with any equilibrium rate of inflation, with the
notable policy implication that, in the long run, there is no trade-off between the twin evils of
inflation and unemployment as exists in the short run.

But what if $g < 1$? In that event, the coefficient of $\bar{p}$ in (18.34) will be negative. Then the
long-run Phillips curve will turn out to be downward-sloping, thereby still providing a
trade-off relation between inflation and unemployment. Whether the long-run Phillips curve is
vertical or negatively sloped is, therefore, critically dependent on the value of the $g$
parameter, which, according to the expectations-augmented Phillips relation, measures the extent to
which the expected rate of inflation can work its way into the wage structure and the actual
rate of inflation. All of this may sound familiar to you. This is because we discussed the topic
in Example 1 in Sec. 16.5, and you have also worked on it in Exercise 16.5-4.
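The long-run relation can also be seen by iterating the model until it settles down. The sketch below assumes the model consists of the expectations-augmented Phillips relation $p_t = \alpha - T - \beta U_t + g\pi_t$, the adaptive-expectations rule $\pi_{t+1} - \pi_t = j(p_t - \pi_t)$, and the updating rule (18.28) for $U$; the parameter values, initial conditions, and function names are illustrative assumptions only.

```python
def simulate(alpha_minus_T, beta, g, j, k, m, U0, pi0, periods):
    """Iterate the inflation-unemployment model and return the final (p, U)."""
    U, pi = U0, pi0
    p = alpha_minus_T - beta * U + g * pi          # Phillips relation in period 0
    for _ in range(periods):
        pi = pi + j * (p - pi)                     # adaptive expectations
        # (18.28): (1 + beta*k) U_{t+1} - U_t = k(alpha - T - m) + k*g*pi_{t+1}
        U = (U + k * (alpha_minus_T - m) + k * g * pi) / (1 + beta * k)
        p = alpha_minus_T - beta * U + g * pi      # Phillips relation next period
    return p, U

def long_run_U(alpha_minus_T, beta, g, p_bar):
    """Equilibrium unemployment implied by (18.34)."""
    return (alpha_minus_T - (1 - g) * p_bar) / beta

params = dict(alpha_minus_T=0.10, beta=2.0, j=0.5, k=0.5)   # assumed values
for g in (0.5, 1.0):
    for m in (0.02, 0.04):
        p_end, U_end = simulate(g=g, m=m, U0=0.08, pi0=0.0, periods=500, **params)
        print(g, m, round(p_end, 6), round(U_end, 6),
              round(long_run_U(0.10, 2.0, g, m), 6))
```

With $g = 1$ the simulated $\bar{U}$ should not respond to $m$ (the vertical long-run curve), whereas with $g < 1$ a higher $m$ should go with a lower $\bar{U}$.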











EXERCISE 18.3 


1. Supply the intermediate steps leading from (18.23) to (18.24).

2. Show that if the model discussed in this section is condensed into a difference equation
in the variable $\pi$, the result will be the same as (18.24) except for the substitution of $\pi$
for $p$.

3. The time paths of $p$ and $U$ in the model discussed in this section have been found to be
consistently convergent. Can divergent time paths arise if we drop the assumption that
$g < 1$? If yes, which divergent "possibilities" in Cases 1, 2, and 3 will now become
feasible?

4. Retain equations (18.18) and (18.19), but change (18.20) to
$$ U_{t+1} - U_t = -k(m - p_t) $$

(a) Derive a new difference equation in the variable $p$.

(b) Does the new difference equation yield a different $\bar{p}$?

(c) Assume that $j = g = 1$. Find the conditions under which the characteristic roots
will fall under Cases 1, 2, and 3, respectively.

(d) Let $j = g = 1$. Describe the time path of $p$ (including convergence or divergence)
when $\beta k = 3$, 4, and 5, respectively.







18.4 Generalizations to Variable-Term and Higher-Order Equations

We are now ready to extend our methods in two directions, to the variable-term case and to
difference equations of higher orders.


Variable Term in the Form of $cm^t$

When the constant term $c$ in (18.1) is replaced by a variable term (some function of $t$), the
only effect will be on the particular solution. (Why?) To find the new particular solution, we
can again apply the method of undetermined coefficients. In the differential-equation
context (Sec. 16.6), that method requires that the variable term and its successive derivatives
together take only a finite number of distinct types of expression, apart from multiplicative
constants. Applied to difference equations, the requirement should be amended to read: "the
variable term and its successive differences must together take only a finite number of
distinct expression types, apart from multiplicative constants." Let us illustrate this method by
concrete examples, first taking a variable term in the form $cm^t$, where $c$ and $m$ are constants.


Example 1

Find the particular solution of
$$ y_{t+2} + y_{t+1} - 3y_t = 7^t $$

Here, we have $c = 1$ and $m = 7$. First, let us ascertain whether the variable term $7^t$ yields a
finite number of expression types on successive differencing. According to the rule of
differencing ($\Delta y_t = y_{t+1} - y_t$), the first difference of the term is

$$ \Delta 7^t = 7^{t+1} - 7^t = (7 - 1)7^t = 6(7)^t $$

Similarly, the second difference, $\Delta^2(7^t)$, can be expressed as

$$ \Delta(\Delta 7^t) = \Delta 6(7)^t = 6(7)^{t+1} - 6(7)^t = 6(7 - 1)7^t = 36(7)^t $$

Moreover, as can be verified, all successive differences will, like the first and second, be
some multiple of $7^t$. Since there is only a single expression type, we can try a solution
$y_t = B(7)^t$ for the particular solution, where $B$ is an undetermined coefficient.

Substituting the trial solution and its corresponding versions for periods $(t + 1)$ and
$(t + 2)$ into the given difference equation, we obtain

$$ B(7)^{t+2} + B(7)^{t+1} - 3B(7)^t = 7^t \qquad \text{or} \qquad B(7^2 + 7 - 3)(7)^t = 7^t $$

Thus,

$$ B = \frac{1}{49 + 7 - 3} = \frac{1}{53} $$

and we can write the particular solution as

$$ y_p = B(7)^t = \frac{1}{53}(7)^t $$

This may be taken as a moving equilibrium. You can verify the correctness of the solution
by substituting it into the difference equation and seeing to it that there will result an
identity, $7^t = 7^t$.
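The substitution check suggested at the end of Example 1 is easy to carry out with exact rational arithmetic. In the sketch below, the helper name `exponential_particular_coefficient` is hypothetical; it obtains $B$ by the same substitution used in the example and refuses to proceed when the denominator $m^2 + a_1 m + a_2$ is zero, since the simple trial solution does not apply in that case.

```python
from fractions import Fraction

def exponential_particular_coefficient(a1, a2, c, m):
    """B in the trial solution y_t = B*m**t for y_{t+2} + a1*y_{t+1} + a2*y_t = c*m**t."""
    denom = Fraction(m) ** 2 + Fraction(a1) * m + Fraction(a2)
    if denom == 0:
        raise ValueError("m**2 + a1*m + a2 = 0: the trial solution B*m**t fails")
    return Fraction(c) / denom

# Example 1: y_{t+2} + y_{t+1} - 3*y_t = 7**t
B = exponential_particular_coefficient(a1=1, a2=-3, c=1, m=7)
print(B)  # 1/53

def y_p(t):
    """Particular solution of Example 1."""
    return B * 7 ** t

# Substituting back should reproduce the variable term 7**t exactly.
for t in range(6):
    assert y_p(t + 2) + y_p(t + 1) - 3 * y_p(t) == 7 ** t
```

Using `Fraction` rather than floating point keeps the identity check exact, which matters because $7^t$ grows quickly.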


The result reached in Example 1 can be easily generalized from the variable term $7^t$ to
that of $cm^t$. From our experience, we expect all the successive differences of $cm^t$ to take the
same form of expression: namely, $Bm^t$, where $B$ is some multiplicative constant. Hence we
can try a solution $y_t = Bm^t$ for the particular solution, when given the difference equation

$$ y_{t+2} + a_1 y_{t+1} + a_2 y_t = cm^t \tag{18.35} $$

Using the trial solution $y_t = Bm^t$, which implies $y_{t+1} = Bm^{t+1}$, etc., we can rewrite
equation (18.35) as

$$ Bm^{t+2} + a_1 Bm^{t+1} + a_2 Bm^t = cm^t \qquad \text{or} \qquad B(m^2 + a_1 m + a_2)m^t = cm^t $$

Hence the coefficient $B$ in the trial solution should be

$$ B = \frac{c}{m^2 + a_1 m + a_2} $$

and the desired particular solution of (18.35) can be written as

$$ y_p = Bm^t = \frac{c}{m^2 + a_1 m + a_2}\,m^t \qquad (m^2 + a_1 m + a_2 \neq 0) \tag{18.36} $$

Note that the denominator of $B$ is not allowed to be zero. If it happens to be,* we must
then use the trial solution $y_t = Btm^t$ instead; or, if that too fails, $y_t = Bt^2 m^t$.


Variable Term in the Form of $ct^n$

Let us now consider variable terms in the form $ct^n$, where $c$ is any constant, and $n$ is a
positive integer.


Example 2

Find the particular solution of
$$ y_{t+2} + 5y_{t+1} + 2y_t = t^2 $$


The first three differences of $t^2$ (a special case of $ct^n$ with $c = 1$ and $n = 2$) are found as
follows:†

$$
\begin{aligned}
\Delta t^2 &= (t + 1)^2 - t^2 = 2t + 1 \\
\Delta^2 t^2 &= \Delta(\Delta t^2) = \Delta(2t + 1) = \Delta 2t + \Delta 1 \\
&= 2(t + 1) - 2t + 0 = 2 \qquad [\Delta\,\text{constant} = 0] \\
\Delta^3 t^2 &= \Delta(\Delta^2 t^2) = \Delta 2 = 0
\end{aligned}
$$

Since further differencing will only yield zero, there are altogether three distinct types of
expression: $t^2$ (from the variable term itself), $t$, and a constant (from the successive
differences).

* Analogous to the situation in Example 3 of Sec. 16.6, this eventuality will materialize when the
constant $m$ happens to be equal to a characteristic root of the difference equation. The characteristic
roots of the difference equation of (18.35) are the values of $b$ that satisfy the equation
$b^2 + a_1 b + a_2 = 0$. If one root happens to have the value $m$, then it must follow that $m^2 + a_1 m + a_2 = 0$.

† These results should be compared with the first three derivatives of $t^2$:
$$ \frac{d}{dt}t^2 = 2t \qquad \frac{d^2}{dt^2}t^2 = 2 \qquad \text{and} \qquad \frac{d^3}{dt^3}t^2 = 0 $$

Let us therefore try the solution

$$ y_t = B_0 + B_1 t + B_2 t^2 $$

for the particular solution, with undetermined coefficients $B_0$, $B_1$, and $B_2$. Note that this
solution implies

$$
\begin{aligned}
y_{t+1} &= B_0 + B_1(t + 1) + B_2(t + 1)^2 \\
&= (B_0 + B_1 + B_2) + (B_1 + 2B_2)t + B_2 t^2 \\
y_{t+2} &= B_0 + B_1(t + 2) + B_2(t + 2)^2 \\
&= (B_0 + 2B_1 + 4B_2) + (B_1 + 4B_2)t + B_2 t^2
\end{aligned}
$$

When these are substituted into the difference equation, we obtain

$$ (8B_0 + 7B_1 + 9B_2) + (8B_1 + 14B_2)t + 8B_2 t^2 = t^2 $$

Equating the two sides term by term, we see that the undetermined coefficients are
required to satisfy the following simultaneous equations:

$$
\begin{aligned}
8B_0 + 7B_1 + 9B_2 &= 0 \\
8B_1 + 14B_2 &= 0 \\
8B_2 &= 1
\end{aligned}
$$

Thus, their values must be $B_0 = \frac{13}{256}$, $B_1 = -\frac{7}{32}$, and $B_2 = \frac{1}{8}$, giving us the particular
solution

$$ y_p = \frac{13}{256} - \frac{7}{32}t + \frac{1}{8}t^2 $$
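The three simultaneous equations of Example 2 are triangular, so they can be solved by back-substitution starting from the $t^2$ coefficient. The short sketch below does this with exact fractions and then substitutes the resulting $y_p$ back into the difference equation; the variable names are illustrative only.

```python
from fractions import Fraction

# Back-substitution, starting from the equation for the t**2 coefficient.
B2 = Fraction(1, 8)              # 8*B2 = 1
B1 = -14 * B2 / 8                # 8*B1 + 14*B2 = 0
B0 = -(7 * B1 + 9 * B2) / 8      # 8*B0 + 7*B1 + 9*B2 = 0
print(B0, B1, B2)                # 13/256 -7/32 1/8

def y_p(t):
    """Particular solution of y_{t+2} + 5*y_{t+1} + 2*y_t = t**2."""
    return B0 + B1 * t + B2 * t ** 2

# The left side should reproduce t**2 exactly in every period.
for t in range(10):
    assert y_p(t + 2) + 5 * y_p(t + 1) + 2 * y_p(t) == t ** 2
```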
Our experience with the variable term $t^2$ should enable us to generalize the method to
the case of $ct^n$. In the new trial solution, there should obviously be a term $B_n t^n$, to
correspond to the given variable term. Furthermore, since successive differencing of the term
yields the distinct expressions $t^{n-1}, t^{n-2}, \ldots, t$, and $B_0$ (constant), the new trial solution for
the case of the variable term $ct^n$ should be written as

$$ y_t = B_0 + B_1 t + B_2 t^2 + \cdots + B_n t^n $$

But the rest of the procedure is entirely the same.

It must be added that such a trial solution may also fail to work. In that event, the trick
(already employed on countless other occasions) is again to multiply the original trial
solution by a sufficiently high power of $t$. That is, we can instead try
$y_t = t(B_0 + B_1 t + B_2 t^2 + \cdots + B_n t^n)$, etc.


Higher-Order Linear Difference Equations 
The order of a difference equation indicates the highest-order difference present in the
equation; but it also indicates the maximum number of periods of time lag involved. An
nth-order linear difference equation (with constant coefficients and constant term) may
thus be written in general as

$$ y_{t+n} + a_1 y_{t+n-1} + \cdots + a_{n-1} y_{t+1} + a_n y_t = c \tag{18.37} $$

The method of finding the particular solution of this does not differ in any substantive
way. As a starter, we can still try $y_t = k$ (the case of stationary intertemporal equilibrium).
Should this fail, we then try $y_t = kt$ or $y_t = kt^2$, etc., in that order.

In the search for the complementary function, however, we shall now be confronted with
a characteristic equation which is an nth-degree polynomial equation:

$$ b^n + a_1 b^{n-1} + \cdots + a_{n-1} b + a_n = 0 \tag{18.38} $$





There will now be $n$ characteristic roots $b_i$ ($i = 1, 2, \ldots, n$), all of which should enter into
the complementary function thus:

$$ y_c = \sum_{i=1}^{n} A_i b_i^t \tag{18.39} $$

provided, of course, that the roots are all real and distinct. In case there are repeated real roots
(say, $b_1 = b_2 = b_3$), then the first three terms in the sum in (18.39) must be modified to

$$ A_1 b_1^t + A_2 t b_1^t + A_3 t^2 b_1^t \qquad [\text{cf. (18.6)}] $$

Moreover, if there is a pair of conjugate complex roots (say, $b_{n-1}$ and $b_n$), then the last two
terms in the sum in (18.39) are to be combined into the expression

$$ R^t(A_{n-1}\cos\theta t + A_n\sin\theta t) $$

A similar expression can also be assigned to any other pair of complex roots. In case of two
repeated pairs, however, one of the two must be given a multiplicative factor of $tR^t$ instead
of $R^t$.

After $y_p$ and $y_c$ are both found, the general solution of the complete difference equation
(18.37) is again obtained by summing; that is,

$$ y_t = y_p + y_c $$

But since there will be a total of $n$ arbitrary constants in this solution, no less than $n$ initial
conditions will be required to definitize it.


Example 3

Find the general solution of the third-order difference equation
$$ y_{t+3} - \frac{7}{8}y_{t+2} + \frac{1}{8}y_{t+1} + \frac{1}{32}y_t = 9 $$

By trying the solution $y_t = k$, the particular solution is easily found to be $y_p = 32$. As for the
complementary function, since the cubic characteristic equation

$$ b^3 - \frac{7}{8}b^2 + \frac{1}{8}b + \frac{1}{32} = 0 $$

can be factored into the form

$$ \left(b - \frac{1}{2}\right)\left(b - \frac{1}{2}\right)\left(b + \frac{1}{8}\right) = 0 $$

the roots are $b_1 = b_2 = \frac{1}{2}$ and $b_3 = -\frac{1}{8}$. This enables us to write

$$ y_c = A_1\left(\frac{1}{2}\right)^t + A_2 t\left(\frac{1}{2}\right)^t + A_3\left(-\frac{1}{8}\right)^t $$

Note that the second term contains a multiplicative $t$; this is due to the presence of repeated
roots. The general solution of the given difference equation is then simply the sum of $y_c$ and $y_p$.

In this example, all three characteristic roots happen to be less than 1 in their absolute
values. We can therefore conclude that the solution obtained represents a time path which
converges to the stationary equilibrium level 32.
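The factorization and the convergence claim in Example 3 can both be checked directly. The sketch below evaluates the characteristic polynomial (and its derivative, to confirm that $\frac{1}{2}$ is a repeated root) with exact fractions, and then iterates the difference equation from arbitrary initial values; the initial values are assumptions chosen only for illustration.

```python
from fractions import Fraction as F

coeffs = [F(1), F(-7, 8), F(1, 8), F(1, 32)]   # b^3 - (7/8)b^2 + (1/8)b + 1/32

def poly(b):
    """Characteristic polynomial of Example 3 evaluated at b."""
    return sum(c * b ** (3 - i) for i, c in enumerate(coeffs))

def poly_derivative(b):
    """Derivative of the characteristic polynomial: 3b^2 - (7/4)b + 1/8."""
    return 3 * b ** 2 - F(7, 4) * b + F(1, 8)

# 1/2 is a repeated root (polynomial and derivative both vanish); -1/8 is a simple root.
assert poly(F(1, 2)) == 0 and poly_derivative(F(1, 2)) == 0
assert poly(F(-1, 8)) == 0

# Iterate y_{t+3} = 9 + (7/8) y_{t+2} - (1/8) y_{t+1} - (1/32) y_t.
path = [0.0, 100.0, -50.0]                      # arbitrary starting values
for _ in range(200):
    y0, y1, y2 = path[-3:]
    path.append(9 + 0.875 * y2 - 0.125 * y1 - 0.03125 * y0)
print(path[-1])                                 # should settle close to 32
```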


Convergence and the Schur Theorem 

When we have a high-order difference equation that is not easily solved, we can
nonetheless determine the convergence of the relevant time path qualitatively without having to
struggle with its actual quantitative solution. You will recall that the time path can converge
if and only if every root of the characteristic equation is less than 1 in absolute value.


In view of this, the following theorem, known as the Schur theorem,† becomes directly
applicable:

The roots of the nth-degree polynomial equation

$$ a_0 b^n + a_1 b^{n-1} + \cdots + a_{n-1} b + a_n = 0 $$

will all be less than unity in absolute value if and only if the following $n$ determinants

$$
\Delta_1 = \left|\begin{array}{c|c} a_0 & a_n \\ \hline a_n & a_0 \end{array}\right|
\qquad
\Delta_2 = \left|\begin{array}{cc|cc}
a_0 & 0 & a_n & a_{n-1} \\
a_1 & a_0 & 0 & a_n \\ \hline
a_n & 0 & a_0 & a_1 \\
a_{n-1} & a_n & 0 & a_0
\end{array}\right|
\qquad \cdots
$$

$$
\Delta_n = \left|\begin{array}{cccc|cccc}
a_0 & 0 & \cdots & 0 & a_n & a_{n-1} & \cdots & a_1 \\
a_1 & a_0 & \cdots & 0 & 0 & a_n & \cdots & a_2 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
a_{n-1} & a_{n-2} & \cdots & a_0 & 0 & 0 & \cdots & a_n \\ \hline
a_n & 0 & \cdots & 0 & a_0 & a_1 & \cdots & a_{n-1} \\
a_{n-1} & a_n & \cdots & 0 & 0 & a_0 & \cdots & a_{n-2} \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
a_1 & a_2 & \cdots & a_n & 0 & 0 & \cdots & a_0
\end{array}\right|
$$

are all positive.

Note that, since the condition in the theorem is given on the "if and only if" basis, it is
a necessary-and-sufficient condition. Thus the Schur theorem is a perfect difference-equation
counterpart of the Routh theorem introduced earlier in the differential-equation
framework.

The construction of these determinants is based on a simple procedure. This is best
explained with the aid of the dashed lines which partition each determinant into four areas.
Each area of the $k$th determinant, $\Delta_k$, always consists of a $k \times k$ subdeterminant. The
upper-left area has $a_0$ alone in the diagonal, zeros above the diagonal, and progressively
larger subscripts for the successive coefficients in each column below the diagonal
elements. When we transpose the elements of the upper-left area, we obtain the lower-right
area. Turning to the upper-right area, we now place the $a_n$ coefficient alone in the diagonal,
with zeros below the diagonal, and progressively smaller subscripts for the successive
coefficients as we go up each column from the diagonal. When the elements of this area are
transposed, we get the lower-left area.

The application of this theorem is straightforward. Since the coefficients of the
characteristic equation are the same as those appearing on the left side of the original difference
equation, we can introduce them directly into the determinants cited. Note that, in our
context, we always have $a_0 = 1$.

† For a discussion of this theorem and its history, see John S. Chipman, The Theory of Inter-Sectoral
Money Flows and Income Formation, The Johns Hopkins Press, Baltimore, 1951, pp. 119-120.


Example 4

Does the time path of the equation $y_{t+2} + 3y_{t+1} + 2y_t = 12$ converge? Here we have $n = 2$,
and the coefficients are $a_0 = 1$, $a_1 = 3$, and $a_2 = 2$. Thus we get

$$ \Delta_1 = \begin{vmatrix} a_0 & a_2 \\ a_2 & a_0 \end{vmatrix} = \begin{vmatrix} 1 & 2 \\ 2 & 1 \end{vmatrix} = -3 < 0 $$

Since this already violates the convergence condition, there is no need to proceed to $\Delta_2$.
Actually, the characteristic roots of the given difference equation are easily found to be
$b_1, b_2 = -1, -2$, which indeed imply a divergent time path.














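Example 5

Test the convergence of the path of $y_{t+2} + \frac{1}{6}y_{t+1} - \frac{1}{6}y_t = 2$ by the Schur theorem. Here the
coefficients are $a_0 = 1$, $a_1 = \frac{1}{6}$, $a_2 = -\frac{1}{6}$ (with $n = 2$). Thus we have

$$ \Delta_1 = \begin{vmatrix} a_0 & a_2 \\ a_2 & a_0 \end{vmatrix} = \begin{vmatrix} 1 & -\frac{1}{6} \\ -\frac{1}{6} & 1 \end{vmatrix} = \frac{35}{36} > 0 $$

$$
\Delta_2 = \begin{vmatrix}
a_0 & 0 & a_2 & a_1 \\
a_1 & a_0 & 0 & a_2 \\
a_2 & 0 & a_0 & a_1 \\
a_1 & a_2 & 0 & a_0
\end{vmatrix}
= \begin{vmatrix}
1 & 0 & -\frac{1}{6} & \frac{1}{6} \\
\frac{1}{6} & 1 & 0 & -\frac{1}{6} \\
-\frac{1}{6} & 0 & 1 & \frac{1}{6} \\
\frac{1}{6} & -\frac{1}{6} & 0 & 1
\end{vmatrix}
= \frac{1{,}176}{1{,}296} > 0
$$

These do satisfy the necessary-and-sufficient condition for convergence.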
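The construction rule for the Schur determinants lends itself to a small routine. The following is a sketch under the layout described in the text (upper-left block with $a_0$ on the diagonal, upper-right block with $a_n$ on the diagonal, and their transposes below); the function names `determinant` and `schur_determinants` are hypothetical, and exact fractions are used so that the results can be compared with Examples 4 and 5.

```python
from fractions import Fraction

def determinant(matrix):
    """Determinant by Gaussian elimination with exact Fraction arithmetic."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size, det = len(a), Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]   # a row swap flips the sign
            det = -det
        det *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return det

def schur_determinants(coeffs):
    """Return [Delta_1, ..., Delta_n] for coeffs = [a_0, a_1, ..., a_n]."""
    n = len(coeffs) - 1
    result = []
    for k in range(1, n + 1):
        # Upper-left: a_0 on the diagonal, larger subscripts going down each column.
        ul = [[coeffs[i - j] if i >= j else 0 for j in range(k)] for i in range(k)]
        # Upper-right: a_n on the diagonal, smaller subscripts going up each column.
        ur = [[coeffs[n - (j - i)] if j >= i else 0 for j in range(k)] for i in range(k)]
        ll = [list(col) for col in zip(*ur)]      # transpose of the upper-right block
        lr = [list(col) for col in zip(*ul)]      # transpose of the upper-left block
        rows = [ul[i] + ur[i] for i in range(k)] + [ll[i] + lr[i] for i in range(k)]
        result.append(determinant(rows))
    return result

print(schur_determinants([1, 3, 2])[0])                          # Example 4
print(schur_determinants([1, Fraction(1, 6), Fraction(-1, 6)]))  # Example 5
```

Convergence requires every entry of the returned list to be positive, so a caller can stop at the first nonpositive determinant, as Example 4 does.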
EXERCISE 18.4 


1. Apply the definition of the "differencing" symbol $\Delta$ to find:
(a) $\Delta t$ (b) $\Delta^2 t$ (c) $\Delta t^3$
Compare the results of differencing with those of differentiation.
2. Find the particular solution of each of the following:
(a) $y_{t+2} + 2y_{t+1} + y_t = 3^t$
(b) $y_{t+2} - 5y_{t+1} - 6y_t = 2(6)^t$
(c) $3y_{t+2} + 9y_t = 3(4)^t$
3. Find the particular solutions of:
(a) $y_{t+2} - 2y_{t+1} + 5y_t = t$
(b) $y_{t+2} - 2y_{t+1} + 5y_t = 4 + 2t$
(c) $y_{t+2} + 5y_{t+1} + 2y_t = 18 + 6t + 8t^2$
4. Would you expect that, when the variable term takes the form $m^t + t^n$, the trial
solution should be $B(m)^t + (B_0 + B_1 t + \cdots + B_n t^n)$? Why?
5. Find the characteristic roots and the complementary function of:
(@) yun — $y — Yet + m= 
() Yaa ~ yet dyes — Fy =7 
[Hint: Try factoring out (b— 1) in both characteristic equations.] 
6. Test the convergence of the solutions of the following difference equations by the 
Schur theorem: 
(2) Yer2 + py -dn53 
( yar - gyn 
7. In the case of a third-order difference equation
$$ y_{t+3} + a_1 y_{t+2} + a_2 y_{t+1} + a_3 y_t = c $$
what are the exact forms of the determinants required by the Schur theorem?


Chapter 19

Simultaneous Differential Equations and Difference Equations


Heretofore, our discussion of economic dynamics has been confined to the analysis of a
single dynamic (differential or difference) equation. In the present chapter, methods for
analyzing a system of simultaneous dynamic equations are introduced. Because this would
entail the handling of several variables at the same time, you might anticipate a great deal
of new complications. But the truth is that much of what we have already learned about
single dynamic equations can be readily extended to systems of simultaneous dynamic
equations. For instance, the solution of a dynamic system would still consist of a set of
particular integrals or particular solutions (intertemporal equilibrium values of the various
variables) and complementary functions (deviations from equilibriums). The complementary
functions would still be based on the reduced equations, i.e., the homogeneous versions
of the equations in the system. And the dynamic stability of the system would still depend
on the signs (if differential equation system) or the absolute values (if difference equation
system) of the characteristic roots in the complementary functions. Thus the problem of a
dynamic system is only slightly more complicated than that of a single dynamic equation.


19.1 The Genesis of Dynamic Systems

There are two general ways in which a dynamic system can come into being. It may
emanate from a given set of interacting patterns of change. Or it may be derived from a single
given pattern of change, provided the latter consists of a dynamic equation of the second
(or higher) order.


Interacting Patterns of Change 

The most obvious case of a given set of interacting patterns of change is that of a multisector
model where each sector, as described by a dynamic equation, impinges on at least one
of the other sectors. A dynamic version of the input-output model, for example, could
involve $n$ industries whose output changes produce dynamic repercussions on the other
industries. Thus it constitutes a dynamic system. Similarly, a dynamic general-equilibrium
market model would involve $n$ commodities that are interrelated in their price adjustments.
Thus, there is again a dynamic system.

However, interacting patterns of change can be found even in a single-sector model. The
various variables in such a model represent, not different sectors or different commodities,
but different aspects of an economy. Nonetheless, they can affect one another in their
dynamic behavior, so as to provide a network of interactions.† A concrete example of this has
in fact been encountered in Chap. 18. In the inflation-unemployment model, the expected
rate of inflation $\pi$ follows a pattern of change, (18.19), that depends not only on $\pi$, but also
on the rate of unemployment $U$ (through the actual rate of inflation $p$). Reciprocally, the
pattern of change of $U$, (18.20), is dependent on $\pi$ (again through $p$). Thus the dynamics
of $\pi$ and $U$ must be simultaneously determined. In retrospect, therefore, the
inflation-unemployment model could have been treated as a simultaneous-equation dynamic model.
And that would have obviated the long sequence of substitutions and eliminations that were
undertaken to condense the model into a single equation in one variable. Below, in Sec. 19.4,
we shall indeed rework that model, viewed as a dynamic system. Meanwhile, the notion that
the same model can be analyzed either as a single equation or as an equation system supplies
a natural cue to the discussion of the second way to have a dynamic system.





The Transformation of a High-Order Dynamic Equation 
Suppose that we are given an nth-order differential (or difference) equation in one variable.
Then, as will be shown, it is always possible to transform that equation into a
mathematically equivalent system of $n$ simultaneous first-order differential (or difference) equations
in $n$ variables. In particular, a second-order differential equation can be rewritten as two
simultaneous first-order differential equations in two variables.‡ Thus, even if we happen to
start out with only one (high-order) dynamic equation, a dynamic system can nevertheless
be derived through the artifice of mathematical transformation. This fact, incidentally, has
an important implication: In the ensuing discussion of dynamic systems, we need only be
concerned with systems of first-order equations, for if a higher-order equation is present,
we can always transform it first into a set of first-order equations. This will result in a larger
number of equations in the system, but the order will then be lowered to the minimum.

To illustrate the transformation procedure, let us consider the single difference equation

$$ y_{t+2} + a_1 y_{t+1} + a_2 y_t = c \tag{19.1} $$

If we concoct an artificial new variable $x_t$, defined by

$$ x_t \equiv y_{t+1} \qquad (\text{implying } x_{t+1} \equiv y_{t+2}) $$

we can then express the original second-order equation by means of two first-order
(one-period lag) simultaneous equations as follows:

$$
\begin{aligned}
x_{t+1} + a_1 x_t + a_2 y_t &= c \\
y_{t+1} - x_t &= 0
\end{aligned}
\tag{19.1'}
$$

† Note that if we have two dynamic equations in the two variables $y_1$ and $y_2$ such that the pattern
of change of $y_1$ depends exclusively on $y_1$ itself, and similarly for $y_2$, we really do not have a
simultaneous-equation system. Instead, we have merely two separate dynamic equations, each of
which can be analyzed by itself, with no requirement of "simultaneity."

‡ Conversely, two first-order differential (or difference) equations in two variables can be consolidated
into a single second-order equation in one variable, as we did in Secs. 16.5 and 18.3.


It is easily seen that, as long as the second equation (which defines the variable $x_t$) is
satisfied, the first is identical with the original given equation. By a similar procedure, and using
more artificial variables, we can similarly transform a higher-order single equation into an
equivalent system of simultaneous first-order equations. You can verify, for instance, that
the third-order equation

$$ y_{t+3} + y_{t+2} - 3y_{t+1} + 2y_t = 0 \tag{19.2} $$

can be expressed as

$$
\begin{aligned}
w_{t+1} + w_t - 3x_t + 2y_t &= 0 \\
x_{t+1} - w_t &= 0 \\
y_{t+1} - x_t &= 0
\end{aligned}
\tag{19.2'}
$$

where $x_t \equiv y_{t+1}$ (so that $x_{t+1} \equiv y_{t+2}$) and $w_t \equiv x_{t+1}$ (so that $w_{t+1} \equiv x_{t+2} \equiv y_{t+3}$).

By a perfectly similar procedure, we can also transform an nth-order differential
equation into a system of $n$ first-order equations. Given the second-order differential equation

$$ y''(t) + a_1 y'(t) + a_2 y(t) = 0 \tag{19.3} $$

for instance, we can introduce a new variable $x(t)$, defined by

$$ x(t) \equiv y'(t) \qquad [\text{implying } x'(t) \equiv y''(t)] $$

Then (19.3) can be rewritten as the following system of two first-order equations:

$$
\begin{aligned}
x'(t) + a_1 x(t) + a_2 y(t) &= 0 \\
y'(t) - x(t) &= 0
\end{aligned}
\tag{19.3'}
$$

where, you may note, the second equation performs the function of defining the newly
introduced $x$ variable, as did the second equation in (19.1'). Essentially the same procedure
can also be used to transform a higher-order differential equation. The only modification is
that we must introduce a correspondingly larger number of new variables.
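The equivalence between a second-order difference equation and its first-order system can be confirmed by iterating both forms from the same starting values. In the sketch below, the two function names and the numerical coefficients are illustrative assumptions; the system form uses the definition $x_t \equiv y_{t+1}$, so its initial value $x_0$ equals $y_1$.

```python
def second_order_path(a1, a2, c, y0, y1, periods):
    """Iterate y_{t+2} + a1*y_{t+1} + a2*y_t = c directly."""
    path = [y0, y1]
    for _ in range(periods):
        path.append(c - a1 * path[-1] - a2 * path[-2])
    return path

def first_order_system_path(a1, a2, c, y0, y1, periods):
    """Iterate the equivalent two-equation system with x_t defined as y_{t+1}."""
    x, y = y1, y0                         # x_0 = y_1 by the definition of x_t
    ys = [y]
    for _ in range(periods + 1):
        # x_{t+1} = c - a1*x_t - a2*y_t  and  y_{t+1} = x_t, updated simultaneously
        x, y = c - a1 * x - a2 * y, x
        ys.append(y)
    return ys

direct = second_order_path(a1=-3, a2=2, c=5, y0=1, y1=4, periods=10)
system = first_order_system_path(a1=-3, a2=2, c=5, y0=1, y1=4, periods=10)
print(direct == system)
```

The tuple assignment inside the loop matters: both new values must be computed from the old $x_t$ and $y_t$, which is what "simultaneous" means here.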


19.2 Solving Simultaneous Dynamic Equations 





The methods for solving simultaneous differential equations and simultaneous difference
equations are quite similar. We shall thus discuss them together in this section. For our
present purposes, we shall confine the discussion to linear equations with constant coefficients
only.


Simultaneous Difference Equations

Suppose that we are given the following system of linear difference equations:

$$
\begin{aligned}
x_{t+1} + 6x_t + 9y_t &= 4 \\
y_{t+1} - x_t &= 0
\end{aligned}
\tag{19.4}
$$

How do we find the time paths of $x$ and $y$ such that both equations in this system will be
satisfied? Essentially, our task is again to seek the particular integrals and complementary
functions, and sum these to obtain the desired time paths of the two variables.


Since particular integrals represent intertemporal equilibrium values, let us denote them
by $\bar{x}$ and $\bar{y}$. As before, it is advisable first to try constant solutions, namely, $x_{t+1} = x_t = \bar{x}$
and $y_{t+1} = y_t = \bar{y}$. This will indeed work in the present case, for upon substituting these
trial solutions into (19.4) we get

$$ \bar{x} = \bar{y} = \frac{1}{4} \tag{19.5} $$

(In case such constant solutions fail to work, however, we must then try solutions of the
form $x_t = k_1 t$, $y_t = k_2 t$, etc.)

For the complementary functions, we should, drawing on our previous experience, adopt
trial solutions of the form

$$ x_t = mb^t \qquad \text{and} \qquad y_t = nb^t \tag{19.6} $$

where $m$ and $n$ are arbitrary constants and the base $b$ represents the characteristic root. It is
then automatically implied that

$$ x_{t+1} = mb^{t+1} \qquad \text{and} \qquad y_{t+1} = nb^{t+1} \tag{19.7} $$

Note that, to simplify matters, we are employing the same base $b \neq 0$ for both variables,
although their coefficients are allowed to differ. It is our aim to find the values of $b$, $m$, and
$n$ that can make the trial solutions (19.6) satisfy the reduced (homogeneous) version
of (19.4).

Upon substituting the trial solutions into the reduced version of (19.4) and canceling the
common factor $b^t \neq 0$, we obtain the two equations

$$
\begin{aligned}
(b + 6)m + 9n &= 0 \\
-m + bn &= 0
\end{aligned}
\tag{19.8}
$$

This can be considered as a linear homogeneous-equation system in the two variables $m$
and $n$, if we are willing to consider $b$ as a parameter for the time being. Because the
system (19.8) is homogeneous, it can yield only the trivial solution $m = n = 0$ if its coefficient
matrix is nonsingular (see Table 5.1 in Sec. 5.5). In that event, the complementary functions
in (19.6) will both be identically zero, signifying that $x$ and $y$ never deviate from their
intertemporal equilibrium values. Since that would be an uninteresting special case, we shall
try to rule out that trivial solution by requiring the coefficient matrix of the system to be
singular. That is, we shall require the determinant of that matrix to vanish:

$$ \begin{vmatrix} b + 6 & 9 \\ -1 & b \end{vmatrix} = b^2 + 6b + 9 = 0 \tag{19.9} $$

From this quadratic equation, we find that $b\,(= b_1 = b_2) = -3$ is the only value which can
prevent $m$ and $n$ from both being zero in (19.8). We shall therefore only use this value of $b$.
Equation (19.9) is called the characteristic equation, and its roots the characteristic roots,
of the given simultaneous difference-equation system.

Once we have a specific value of $b$, (19.8) gives us the corresponding solution values of
$m$ and $n$. The system being homogeneous, however, there will actually emerge an infinite
number of solutions for $(m, n)$, expressible in the form of an equation $m = kn$, where $k$ is a
constant. In fact, for each root $b_i$, there will in general be a distinct equation $m_i = k_i n_i$.
Even with repeated roots, with $b_1 = b_2$, we should still use two such equations, $m_1 = k_1 n_1$
and $m_2 = k_2 n_2$, in the complementary functions. Moreover, with repeated roots, we recall
from (18.6) that the complementary functions should be written as

$$
\begin{aligned}
x_t &= m_1(-3)^t + m_2 t(-3)^t \\
y_t &= n_1(-3)^t + n_2 t(-3)^t
\end{aligned}
$$

The factors of proportionality between $m_i$ and $n_i$ must, of course, satisfy the given
equation system (19.4), which mandates that $y_{t+1} = x_t$, i.e.,

$$ n_1(-3)^{t+1} + n_2(t + 1)(-3)^{t+1} = m_1(-3)^t + m_2 t(-3)^t $$

Dividing through by $(-3)^t$, we get

$$ -3n_1 - 3n_2(t + 1) = m_1 + m_2 t $$

or, after rearranging,

$$ -3(n_1 + n_2) - 3n_2 t = m_1 + m_2 t $$

Equating the terms with $t$ on the two sides of the equals sign, and similarly for the terms
without $t$, we find

$$ m_1 = -3(n_1 + n_2) \qquad \text{and} \qquad m_2 = -3n_2 $$

If we now write $n_1 = A_3$, $n_2 = A_4$, then it follows that

$$ m_1 = -3(A_3 + A_4) \qquad m_2 = -3A_4 $$

Thus the complementary functions can be written as

$$
\begin{aligned}
x_c &= -3(A_3 + A_4)(-3)^t - 3A_4 t(-3)^t \\
&= -3A_3(-3)^t - 3A_4(t + 1)(-3)^t \\
y_c &= A_3(-3)^t + A_4 t(-3)^t
\end{aligned}
\tag{19.10}
$$

where $A_3$ and $A_4$ are arbitrary constants. Then the general solution follows easily by
combining the particular solutions in (19.5) with the complementary functions just found. All
that remains, then, is to definitize the two arbitrary constants $A_3$ and $A_4$ with the help of
appropriate initial or boundary conditions.

One significant feature of the preceding solution is that, since both time paths have
identical $b^t$ expressions in them, they must either both converge or both diverge. This makes
sense because, in a model with dynamically interdependent variables, a general
intertemporal equilibrium cannot prevail unless no dynamic motion is present anywhere in the
system. In the present case, with repeated roots $b = -3$, the time paths of both $x$ and $y$ will
display explosive oscillation.
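As a check on the algebra, the general solution (particular solutions $\bar{x} = \bar{y} = \frac{1}{4}$ plus the complementary functions in (19.10)) can be compared with a direct iteration of system (19.4). The sketch below is illustrative: the function names and the initial values $x_0 = 1$, $y_0 = 2$ are assumptions, and the constants $A_3$ and $A_4$ are definitized from those initial values by setting $t = 0$.

```python
from fractions import Fraction

def iterate_system(x0, y0, periods):
    """Iterate (19.4): x_{t+1} = 4 - 6*x_t - 9*y_t and y_{t+1} = x_t."""
    xs, ys = [Fraction(x0)], [Fraction(y0)]
    for _ in range(periods):
        x, y = xs[-1], ys[-1]
        xs.append(4 - 6 * x - 9 * y)
        ys.append(x)
    return xs, ys

def closed_form(x0, y0, t):
    """General solution: equilibrium values 1/4 plus the terms in (19.10)."""
    quarter = Fraction(1, 4)
    A3 = Fraction(y0) - quarter                  # from y_0 = 1/4 + A3
    A4 = (quarter - Fraction(x0)) / 3 - A3       # from x_0 = 1/4 - 3*A3 - 3*A4
    power = Fraction(-3) ** t
    x = quarter - 3 * A3 * power - 3 * A4 * (t + 1) * power
    y = quarter + A3 * power + A4 * t * power
    return x, y

xs, ys = iterate_system(1, 2, 12)
matches = all(closed_form(1, 2, t) == (xs[t], ys[t]) for t in range(13))
print(matches)
print([int(x) for x in xs[:5]])   # sign-alternating, growing deviations
```

The printed values alternate in sign and grow in magnitude, which is the explosive oscillation implied by the repeated root $b = -3$.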


Matrix Notation 

In order to bring out the basic parallelism between the methods of solving a single equation
and an equation system, the preceding exposition was carried out without the benefit of
matrix notation. Let us now see how the latter can be utilized here. Even though it may seem
pointless to apply matrix notation to a simple system of only two equations, the possibility
of extending that notation to the n-equation case should make it a worthwhile exercise.


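First of all, the given system (19.4) may be expressed as

$$
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} x_{t+1} \\ y_{t+1} \end{bmatrix}
+ \begin{bmatrix} 6 & 9 \\ -1 & 0 \end{bmatrix}
\begin{bmatrix} x_t \\ y_t \end{bmatrix}
= \begin{bmatrix} 4 \\ 0 \end{bmatrix}
\tag{19.4'}
$$

or, more succinctly, as

$$ Iu + Kv = d \tag{19.4''} $$

where $I$ is the $2 \times 2$ identity matrix; $K$ is the $2 \times 2$ matrix of the coefficients of the $x_t$ and
$y_t$ terms; and $u$, $v$, and $d$ are column vectors defined as follows:*

$$
u \equiv \begin{bmatrix} x_{t+1} \\ y_{t+1} \end{bmatrix}
\qquad v \equiv \begin{bmatrix} x_t \\ y_t \end{bmatrix}
\qquad d \equiv \begin{bmatrix} 4 \\ 0 \end{bmatrix}
$$

The reader may find one feature puzzling: Since we know $Iu = u$, why not drop the $I$? The
answer is that, even though it seems redundant now, the identity matrix will be needed in
subsequent operations, and therefore we shall retain it as in (19.4'').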

When we try constant solutions x,,; = 1, =X and y,4) =); = ¥ for the particular 


solutions, we are in effect setting w =u = [F ; this will reduce (19.4") to 
(+R) I] =d 
? 
If the inverse (7 + K) | exists, we can express the particular solutions as 
x -1 z 
; =(/+Kky'd (19.55 


This is of course a general formula, for it is valid for any matrix K and vector d as long as 
(1+)! exists. Applied to our numerical example, we have 
Therefore, 3 


-1 t 9 
a, | 7 9 4)_ |i wel 4] _ 
(+k) a=[ 7 i} [of=fa 2 ]lof]= 
16 16 
Turning to the complementary functions, we sce that the trial solutions (19.6} and (19.7) 


= 4 which checks with (19.5). 
give the w and v vectors the specific forms 


tt] io 
ue [we | = ["]e and oe me |= [| 


When substituted into the reduced equation /v + K v = 0, these trial solutions will trans- 


form the latter into 
I yee +x[" |e =0 
n A 


* The symbol v here denotes a vector. Do not confuse it with the vin the complex-number notation 
A+ vi, where it represents a scalar. 




or, after multiplying through by $b^{-t}$ (a scalar) and factoring,

$$(bI + K) \begin{bmatrix} m \\ n \end{bmatrix} = 0 \tag{19.8'}$$


where 0 is a zero vector. It is from this homogeneous-equation system that we are to find
the appropriate values of $b$, $m$, and $n$ to be used in the trial solutions in order to make the
latter determinate.

To avoid trivial solutions for $m$ and $n$, it is necessary that

$$|bI + K| = 0 \tag{19.9'}$$


And this is the characteristic equation which will give us the characteristic roots $b_i$. You can
verify that if we substitute

$$bI = \begin{bmatrix} b & 0 \\ 0 & b \end{bmatrix} \qquad \text{and} \qquad K = \begin{bmatrix} 6 & 9 \\ -1 & 0 \end{bmatrix}$$

into this equation, the result will precisely be (19.9), yielding the repeated roots $b = -3$.
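The two matrix computations just described can be checked with a few lines of exact arithmetic. The following is only a minimal sketch for the 2 x 2 numerical example; the helper names (`inverse_2x2`, `mat_vec`) are illustrative and not part of the text. It uses the fact that, for a 2 x 2 matrix, $|bI + K| = b^2 + (\text{tr}\,K)\,b + |K|$.

```python
from fractions import Fraction as F

# System (19.4) in the form Iu + Kv = d; K and d are read off the text.
K = [[F(6), F(9)], [F(-1), F(0)]]
d = [F(4), F(0)]

def inverse_2x2(a):
    """Inverse of a 2x2 matrix via the adjugate; raises if singular."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def mat_vec(a, v):
    """Product of a 2x2 matrix and a 2-vector."""
    return [a[0][0] * v[0] + a[0][1] * v[1],
            a[1][0] * v[0] + a[1][1] * v[1]]

# Particular solution (19.5'): (I + K)^{-1} d
I_plus_K = [[1 + K[0][0], K[0][1]], [K[1][0], 1 + K[1][1]]]
particular = mat_vec(inverse_2x2(I_plus_K), d)

# Characteristic equation |bI + K| = b^2 + (tr K) b + det K = 0
trace_K = K[0][0] + K[1][1]
det_K = K[0][0] * K[1][1] - K[0][1] * K[1][0]
char_coeffs = (F(1), trace_K, det_K)
discriminant = trace_K ** 2 - 4 * det_K   # zero means a repeated root

print(particular, char_coeffs, discriminant)
```

A zero discriminant signals the repeated root $b = -\frac{1}{2}\,\text{tr}\,K = -3$.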
In general, each root $b_i$ will elicit from (19.8') an infinite set of solution values of
$m$ and $n$ which are tied to each other by the equation $m_i = k_i n_i$. It is
therefore possible to write, for each value of $b_i$,

$$n_i = A_i \qquad \text{and} \qquad m_i = k_i A_i$$

where $A_i$ are arbitrary constants to be definitized later. When substituted into the trial
solutions, these expressions for $n_i$ and $m_i$ along with the values $b_i$ will lead to specific forms
of complementary functions. If all roots are distinct real numbers, we may apply (18.5) and
write

$$\begin{bmatrix} x_c \\ y_c \end{bmatrix} = \begin{bmatrix} \sum m_i b_i^t \\ \sum n_i b_i^t \end{bmatrix} = \begin{bmatrix} \sum k_i A_i b_i^t \\ \sum A_i b_i^t \end{bmatrix}$$


With repeated roots, however, we must apply (18.6) instead and, as a result, the complementary
functions will contain terms with an extra multiplicative $t$, such as $m_1 b^t + m_2 t b^t$
(for $x_c$) and $n_1 b^t + n_2 t b^t$ (for $y_c$). The factors of proportionality between $m_i$ and $n_i$ are to
be determined by the relationship between the variables $x$ and $y$ as stipulated in the given
equation system, as illustrated in (19.10) in our numerical example. Finally, in the
complex-root case, the complementary functions should be written with (18.10) as their
prototype.

To get the general solution, we can then simply form the sum

$$\begin{bmatrix} x_t \\ y_t \end{bmatrix} = \begin{bmatrix} x_c \\ y_c \end{bmatrix} + \begin{bmatrix} \bar{x} \\ \bar{y} \end{bmatrix}$$

Then it remains only to definitize the arbitrary constants $A_i$.

The extension of this procedure to the n-equation system should be self-evident. When
$n$ is large, however, the characteristic equation (an nth-degree polynomial equation) may
not be easy to solve quantitatively. In that event, we may again find the Schur theorem to be
of help in yielding certain qualitative conclusions about the time paths of the variables in
the system. All these variables, we recall, are assigned the same base $b$ in the trial solutions,
so they must end up with the same $b_i^t$ expressions in the complementary functions and
share the same convergence properties. Thus a single application of the Schur theorem will
enable us to determine the convergence or divergence of the time path of every variable in
the system.
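When the system is small enough for the roots to be computed directly, the shared convergence property can be checked by brute force: every variable converges if and only if every characteristic root has modulus less than 1. The sketch below does this for a quadratic characteristic equation. It is not an implementation of the Schur theorem (which avoids computing the roots); the function names and the second set of coefficients are made up for illustration.

```python
import cmath

def quadratic_roots(a1, a2):
    """Roots of b^2 + a1*b + a2 = 0 (possibly complex)."""
    disc = cmath.sqrt(a1 * a1 - 4 * a2)
    return ((-a1 + disc) / 2, (-a1 - disc) / 2)

def is_convergent(roots):
    """Difference-equation time paths converge iff every root has modulus < 1."""
    return all(abs(b) < 1 for b in roots)

# The numerical example of this section: b^2 + 6b + 9 = 0 (repeated root -3).
roots_example = quadratic_roots(6, 9)

# A hypothetical damped case: b^2 - 0.5b + 0.06 = 0 (roots 0.3 and 0.2).
roots_hypothetical = quadratic_roots(-0.5, 0.06)

print(is_convergent(roots_example), is_convergent(roots_hypothetical))
```

Because both $x_t$ and $y_t$ share the same roots, one call to `is_convergent` settles the question for every variable in the system.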


Simultaneous Differential Equations 

The method of solution just described can also be applied to a first-order linear
differential-equation system. About the only major modification needed is to change the trial
solutions to

$$x(t) = m e^{rt} \qquad \text{and} \qquad y(t) = n e^{rt} \tag{19.11}$$

which imply that

$$x'(t) = r m e^{rt} \qquad \text{and} \qquad y'(t) = r n e^{rt} \tag{19.12}$$

In line with our notational convention, the characteristic roots are now denoted by $r$ instead
of $b$.

Suppose that we are given the following equation system:

$$\begin{aligned} x'(t) + 2y'(t) + 2x(t) + 5y(t) &= 77 \\ y'(t) + x(t) + 4y(t) &= 61 \end{aligned} \tag{19.13}$$


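First, let us rewrite it in matrix notation as

$$Ju + Mv = g \tag{19.13'}$$

where the matrices are

$$J = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} \quad u = \begin{bmatrix} x'(t) \\ y'(t) \end{bmatrix} \quad M = \begin{bmatrix} 2 & 5 \\ 1 & 4 \end{bmatrix} \quad v = \begin{bmatrix} x(t) \\ y(t) \end{bmatrix} \quad g = \begin{bmatrix} 77 \\ 61 \end{bmatrix}$$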


Note that, in view of the appearance of the $2y'(t)$ term in the first equation of (19.13), we
have to use the matrix $J$ in place of the identity matrix $I$, as in (19.4''). Of course, if $J$ is
nonsingular (so that $J^{-1}$ exists), then we can in a sense normalize (19.13') by premultiplying
every term therein by $J^{-1}$, to get

$$J^{-1}Ju + J^{-1}Mv = J^{-1}g \qquad \text{or} \qquad Iu + Kv = d \qquad (K \equiv J^{-1}M;\; d \equiv J^{-1}g) \tag{19.13''}$$

This new format is an exact duplicate of (19.4''), although it must be remembered that the
vectors $u$ and $v$ have altogether different meanings in the two different contexts. In the
ensuing development, we shall adhere to the $Ju + Mv = g$ formulation given in (19.13').

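To find the particular integrals, let us try constant solutions $x(t) = \bar{x}$ and $y(t) = \bar{y}$,
which imply that $x'(t) = y'(t) = 0$. If these solutions hold, the vectors $v$ and $u$ will become
$v = \begin{bmatrix} \bar{x} \\ \bar{y} \end{bmatrix}$ and $u = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$, and (19.13') will reduce to $Mv = g$. Thus the solution for $\bar{x}$
and $\bar{y}$ can be written as

$$\begin{bmatrix} \bar{x} \\ \bar{y} \end{bmatrix} = M^{-1} g \tag{19.14}$$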




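which you should compare with (19.5'). In numerical terms, our present problem yields the
following particular integrals:

$$\begin{bmatrix} \bar{x} \\ \bar{y} \end{bmatrix} = \begin{bmatrix} 2 & 5 \\ 1 & 4 \end{bmatrix}^{-1} \begin{bmatrix} 77 \\ 61 \end{bmatrix} = \begin{bmatrix} \frac{4}{3} & -\frac{5}{3} \\ -\frac{1}{3} & \frac{2}{3} \end{bmatrix} \begin{bmatrix} 77 \\ 61 \end{bmatrix} = \begin{bmatrix} 1 \\ 15 \end{bmatrix}$$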
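As a quick numerical check of (19.14), the system $Mv = g$ can be solved directly rather than by inverting $M$ first. The sketch below uses Cramer's rule with exact fractions; `solve_2x2` is an illustrative helper, not something defined in the text.

```python
from fractions import Fraction as F

# M and g from the numerical system (19.13').
M = [[F(2), F(5)], [F(1), F(4)]]
g = [F(77), F(61)]

def solve_2x2(a, rhs):
    """Solve a 2x2 linear system a*z = rhs by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("coefficient matrix is singular")
    z1 = (rhs[0] * a[1][1] - a[0][1] * rhs[1]) / det
    z2 = (a[0][0] * rhs[1] - rhs[0] * a[1][0]) / det
    return [z1, z2]

# Particular integrals: the constant values that make x'(t) = y'(t) = 0.
x_bar, y_bar = solve_2x2(M, g)
print(x_bar, y_bar)
```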


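Next, let us look for the complementary functions. Using the trial solutions suggested in
(19.11) and (19.12), the vectors $u$ and $v$ become

$$u = \begin{bmatrix} m \\ n \end{bmatrix} r e^{rt} \qquad \text{and} \qquad v = \begin{bmatrix} m \\ n \end{bmatrix} e^{rt}$$

Substitution of these into the reduced equation

$$Ju + Mv = 0$$

yields the result

$$J \begin{bmatrix} m \\ n \end{bmatrix} r e^{rt} + M \begin{bmatrix} m \\ n \end{bmatrix} e^{rt} = 0$$

or, after multiplying through by the scalar $e^{-rt}$ and factoring,

$$(rJ + M) \begin{bmatrix} m \\ n \end{bmatrix} = 0 \tag{19.15}$$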


You should compare this with (19.8'). Since our objective is to find nontrivial solutions of
$m$ and $n$ (so that our trial solutions will also be nontrivial), it is necessary that

$$|rJ + M| = 0 \tag{19.16}$$

The analog of (19.9'), this last equation (the characteristic equation of the given equation
system) will yield the roots $r_i$ that we need. Then, we can find the corresponding
(nontrivial) values of $m_i$ and $n_i$.

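In our present example, the characteristic equation is

$$|rJ + M| = \begin{vmatrix} r + 2 & 2r + 5 \\ 1 & r + 4 \end{vmatrix} = r^2 + 4r + 3 = 0 \tag{19.16'}$$

with roots $r_1 = -1$, $r_2 = -3$. Substituting these into (19.15), we get

$$\begin{bmatrix} 1 & 3 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} m_1 \\ n_1 \end{bmatrix} = 0 \qquad (\text{for } r_1 = -1)$$

$$\begin{bmatrix} -1 & -1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} m_2 \\ n_2 \end{bmatrix} = 0 \qquad (\text{for } r_2 = -3)$$

It follows that $m_1 = -3n_1$ and $m_2 = -n_2$, which we may also express as

$$\begin{aligned} m_1 &= 3A_1 & \qquad m_2 &= A_2 \\ n_1 &= -A_1 & \qquad n_2 &= -A_2 \end{aligned}$$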

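Now that $r_i$, $m_i$, and $n_i$ have all been found, the complementary functions can be written
as the following linear combinations of exponential expressions:

$$\begin{bmatrix} x_c \\ y_c \end{bmatrix} = \begin{bmatrix} \sum m_i e^{r_i t} \\ \sum n_i e^{r_i t} \end{bmatrix} \qquad [\text{distinct real roots}]$$

And the general solution will emerge in the form

$$\begin{bmatrix} x(t) \\ y(t) \end{bmatrix} = \begin{bmatrix} x_c \\ y_c \end{bmatrix} + \begin{bmatrix} \bar{x} \\ \bar{y} \end{bmatrix}$$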


In our present example, the solution is

$$\begin{bmatrix} x(t) \\ y(t) \end{bmatrix} = \begin{bmatrix} 3A_1 e^{-t} + A_2 e^{-3t} + 1 \\ -A_1 e^{-t} - A_2 e^{-3t} + 15 \end{bmatrix}$$

Moreover, if we are given the initial conditions $x(0) = 6$ and $y(0) = 12$, the arbitrary
constants can be found to be $A_1 = 1$ and $A_2 = 2$. These will serve to definitize the preceding
solution.
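A definite solution of this kind is easy to verify by substituting it back into the original system. The following sketch does so numerically for (19.13), using the constants $A_1 = 1$ and $A_2 = 2$ found above; the derivative functions are written out by hand, and the function names are illustrative only.

```python
import math

A1, A2 = 1.0, 2.0   # constants definitized from x(0) = 6 and y(0) = 12

def x(t):
    return 3 * A1 * math.exp(-t) + A2 * math.exp(-3 * t) + 1

def y(t):
    return -A1 * math.exp(-t) - A2 * math.exp(-3 * t) + 15

def dx(t):
    """x'(t), differentiated term by term."""
    return -3 * A1 * math.exp(-t) - 3 * A2 * math.exp(-3 * t)

def dy(t):
    """y'(t), differentiated term by term."""
    return A1 * math.exp(-t) + 3 * A2 * math.exp(-3 * t)

def residuals(t):
    """Left side minus right side of each equation in (19.13)."""
    r1 = dx(t) + 2 * dy(t) + 2 * x(t) + 5 * y(t) - 77
    r2 = dy(t) + x(t) + 4 * y(t) - 61
    return r1, r2

for t in (0.0, 0.5, 2.0):
    print(t, residuals(t))
```

Both residuals should vanish (up to rounding) at every $t$, and $x(0)$ and $y(0)$ should reproduce the given initial conditions.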

Once more we may observe that, since the $e^{r_i t}$ expressions are shared by both time paths
$x(t)$ and $y(t)$, the latter must either both converge or both diverge. The roots being $-1$ and
$-3$ in the present case, both time paths converge to their respective equilibria, namely,
$\bar{x} = 1$ and $\bar{y} = 15$.

Even though our example consists of a two-equation system only, the method certainly
extends to the general n-equation system. When $n$ is large, quantitative solutions may again
be difficult, but once the characteristic equation is found, a qualitative analysis will always
be possible by resorting to the Routh theorem.





Further Comments on the Characteristic Equation 
The term "characteristic equation" has now been encountered in three separate contexts: In
Sec. 11.3, we spoke of the characteristic equation of a matrix; in Secs. 16.1 and 18.1, the
term was applied to a single linear differential equation and difference equation; now,
in this section, we have just introduced the characteristic equation of a system of linear
difference or differential equations. Is there a connection between the three?

There indeed is, and the connection is a close one. In the first place, given a single
equation and an equivalent equation system, as exemplified by the equation (19.1) and
the system (19.1'), or the equation (19.3) and the system (19.3'), their characteristic
equations must be identical. For illustration, consider the difference equation (19.1),
$y_{t+2} + a_1 y_{t+1} + a_2 y_t = c$. We have earlier learned to write its characteristic equation by
directly transplanting its constant coefficients into a quadratic equation:

$$b^2 + a_1 b + a_2 = 0$$

What about the equivalent system (19.1')? Taking that system to be in the form of
$Iu + Kv = d$, as in (19.4''), we have the matrix $K = \begin{bmatrix} a_1 & a_2 \\ -1 & 0 \end{bmatrix}$. So the characteristic
equation is

$$|bI + K| = \begin{vmatrix} b + a_1 & a_2 \\ -1 & b \end{vmatrix} = b^2 + a_1 b + a_2 = 0 \qquad [\text{by (19.9')}] \tag{19.17}$$

which is precisely the same as the one obtained from the single equation, as was asserted.
Naturally, the same type of result holds also in the differential-equation framework, the
only difference being that we would, in accordance with our convention, replace the symbol
$b$ by the symbol $r$ in the latter framework.




It is also possible to link the characteristic equation of a difference- (or differential-)
equation system to that of a particular square matrix, which we shall call $D$. Referring to
the definition in (11.14), but using the symbol $b$ (instead of $r$) for the difference-equation
framework, we can write the characteristic equation of matrix $D$ as follows:

$$|D - bI| = 0 \tag{19.18}$$

In general, if we multiply every element of the determinant $|D - bI|$ by $-1$, the value of
the determinant will be unchanged if matrix $D$ contains an even number of rows (or
columns), and will change its sign if $D$ contains an odd number of rows. In the present case,
however, since $|D - bI|$ is to be set equal to zero, multiplying every element by $-1$ will
not matter, regardless of the dimension of matrix $D$. But to multiply every element of the
determinant $|D - bI|$ by $-1$ is tantamount to multiplying the matrix $(D - bI)$ by $-1$ (see
Example 6 of Sec. 5.3) before taking its determinant. Thus, (19.18) can be rewritten as

$$|bI - D| = 0 \tag{19.18'}$$

When this is equated to (19.17), it becomes clear that if we pick the matrix $D = -K$, then
its characteristic equation will be identical with that of the system (19.1'). This matrix,
$-K$, has a special meaning: If we take the reduced version of the system, $Iu + Kv = 0$,
and express it in the form of $Iu = -Kv$, or simply $u = -Kv$, we see that $-K$ is the matrix
that can transform the vector $v = \begin{bmatrix} x_t \\ y_t \end{bmatrix}$ into the vector $u = \begin{bmatrix} x_{t+1} \\ y_{t+1} \end{bmatrix}$ in that particular
equation.

Again, the same reasoning can be adapted to the differential-equation system (19.3').
However, in the case of a system such as (19.13'), $Ju + Mv = g$, where (unlike in the
system (19.3')) the first term is $Ju$ rather than $Iu$, the characteristic equation is in the form

$$|rJ + M| = 0 \qquad [\text{cf. (19.16)}]$$

For this case, if we wish to find the expression for the matrix $D$, we must first normalize the
equation $Ju + Mv = g$ into the form of (19.13''), and then take $D = -K = -J^{-1}M$.

In sum, given (1) a single difference or differential equation, and (2) an equivalent equation
system, from which we can also obtain (3) an appropriate matrix $D$, if we try to find the
characteristic equations of all three of these, the results must be one and the same.
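The three-way agreement can be illustrated with the coefficients $a_1 = 6$ and $a_2 = 9$ of the numerical example. The sketch below builds the characteristic polynomial in each of the three ways for the 2 x 2 case and compares the coefficients; the function names are hypothetical, and each tuple `(1, c1, c2)` stands for the polynomial $b^2 + c_1 b + c_2$.

```python
def from_single_equation(a1, a2):
    """y_{t+2} + a1*y_{t+1} + a2*y_t = c  gives  b^2 + a1*b + a2 = 0."""
    return (1, a1, a2)

def from_system(K):
    """|bI + K| = 0 for a 2x2 K, i.e. b^2 + tr(K)*b + det(K) = 0."""
    trace = K[0][0] + K[1][1]
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    return (1, trace, det)

def from_matrix(D):
    """|D - bI| = 0 for a 2x2 D, i.e. b^2 - tr(D)*b + det(D) = 0."""
    trace = D[0][0] + D[1][1]
    det = D[0][0] * D[1][1] - D[0][1] * D[1][0]
    return (1, -trace, det)

a1, a2 = 6, 9
K = [[a1, a2], [-1, 0]]                  # coefficient matrix of the equivalent system
D = [[-k for k in row] for row in K]     # D = -K

print(from_single_equation(a1, a2), from_system(K), from_matrix(D))
```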





EXERCISE 19.2 


1. Verify that the difference-equation system (19.4) is equivalent to the single equation
$y_{t+2} + 6y_{t+1} + 9y_t = 4$, which was solved earlier as Example 4 in Sec. 18.1. How do the
solutions obtained by the two different methods compare?

2. Show that the characteristic equation of the difference equation (19.2) is identical with
that of the equivalent system (19.2').

3. Solve the following two difference-equation systems:

(a) $x_{t+1} + x_t + 2y_t = 24$
    $y_{t+1} + 2x_t - 2y_t = 9$ (with $x_0 = 10$ and $y_0 = 9$)

(b) $x_{t+1} - x_t - \frac{1}{3} y_t = -1$
    $x_{t+1} + y_{t+1} - \frac{1}{6} y_t = 8\frac{1}{2}$ (with $x_0 = 5$ and $y_0 = 4$)

4. Solve the following two differential-equation systems:

(a) $x'(t) - x(t) - 12y(t) = -60$
    $y'(t) + x(t) + 6y(t) = 36$ [with $x(0) = 13$ and $y(0) = 4$]

(b) $x'(t) - 2x(t) + 3y(t) = 10$
    $y'(t) - x(t) + 2y(t) = 9$ [with $x(0) = 8$ and $y(0) = 5$]

5. On the basis of the differential-equation system (19.13), find the matrix $D$ whose
characteristic equation is identical with that of the system. Check that the characteristic
equations of the two are indeed the same.


19.3 Dynamic Input-Output Models 





Our first encounter with input-output analysis was concerned with the question: How much
should be produced in each industry so that the input requirements of all industries, as well
as the final demand (open system), will be exactly satisfied? The context was static, and the
problem was to solve a simultaneous-equation system for the equilibrium output levels of
all industries. When certain additional economic considerations are incorporated into the
model, the input-output system can take on a dynamic character, and there will then result
a difference- or differential-equation system of the type discussed in Sec. 19.2.

Three such dynamizing considerations will be considered here. To keep the exposition
simple, however, we shall illustrate with two-industry open systems only. Nevertheless,
since we shall employ matrix notation, the generalization to the n-industry case should not
prove difficult, for it can be accomplished simply by duly changing the dimensions of the
matrices involved. For purposes of such generalization, it will prove advisable to denote the
variables not by $x_t$ and $y_t$, but by $x_{1,t}$ and $x_{2,t}$, so that we can extend the notation to $x_{n,t}$
when needed. You will recall that, in the input-output context, $x_i$ represents the output
(measured in dollars) of the ith industry; the new subscript $t$ will now add a time dimension
to it. The input-coefficient symbol $a_{ij}$ will still mean the dollar worth of the ith commodity
required in the production of a dollar's worth of the jth commodity, and $d_i$ will again
indicate the final demand for the ith commodity.


Time Lag in Production


In a static two-industry open system, the output of industry I should be set at the level of
demand as follows:

$$x_1 = a_{11} x_1 + a_{12} x_2 + d_1$$

Now assume a one-period lag in production, so that the amount demanded in period $t$
determines not the current output but the output of period $(t + 1)$. To depict this new
situation, we must modify the preceding equation to the form

$$x_{1,t+1} = a_{11} x_{1,t} + a_{12} x_{2,t} + d_{1,t} \tag{19.19}$$

Similarly, we can write for industry II:

$$x_{2,t+1} = a_{21} x_{1,t} + a_{22} x_{2,t} + d_{2,t} \tag{19.19'}$$


Thus we now have a system of simultaneous difference equations; this constitutes a
dynamic version of the input-output model.

In matrix notation, the system consists of the equation

$$x_{t+1} - A x_t = d_t \tag{19.20}$$

where

$$x_{t+1} = \begin{bmatrix} x_{1,t+1} \\ x_{2,t+1} \end{bmatrix} \qquad x_t = \begin{bmatrix} x_{1,t} \\ x_{2,t} \end{bmatrix} \qquad A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \qquad d_t = \begin{bmatrix} d_{1,t} \\ d_{2,t} \end{bmatrix}$$

Clearly, (19.20) is in the form of (19.4''), with only two exceptions. First, unlike vector $u$,
vector $x_{t+1}$ does not have an identity matrix $I$ as its "coefficient." However, as explained
earlier, this really makes no analytical difference. The second, and more substantive, point
is that the vector $d_t$, with a time subscript, implies that the final-demand vector is being
viewed as a function of time. If this function is nonconstant, a modification will be required
in the method of finding the particular solutions, although the complementary functions
will remain unaffected. The following example will illustrate the modified procedure.


Example 1


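Given the exponential final-demand vector

$$d_t = \begin{bmatrix} \delta^t \\ \delta^t \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \delta^t \qquad (\delta = \text{a positive scalar})$$

find the particular solutions of the dynamic input-output model (19.20). In line with the
method of undetermined coefficients introduced in Sec. 18.4, we should try solutions of the
form $x_{1,t} = \beta_1 \delta^t$ and $x_{2,t} = \beta_2 \delta^t$, where $\beta_1$ and $\beta_2$ are undetermined coefficients. That is, we
should try

$$x_t = \begin{bmatrix} \beta_1 \delta^t \\ \beta_2 \delta^t \end{bmatrix} = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} \delta^t \tag{19.21}$$

which implies†

$$x_{t+1} = \begin{bmatrix} \beta_1 \delta^{t+1} \\ \beta_2 \delta^{t+1} \end{bmatrix} = \begin{bmatrix} \beta_1 \delta \\ \beta_2 \delta \end{bmatrix} \delta^t = \begin{bmatrix} \delta & 0 \\ 0 & \delta \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} \delta^t$$

If the indicated trial solutions hold, then the system (19.20) will become

$$\begin{bmatrix} \delta & 0 \\ 0 & \delta \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} \delta^t - \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} \delta^t = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \delta^t$$

or, on canceling the common scalar multiplier $\delta^t \neq 0$,

$$\begin{bmatrix} \delta - a_{11} & -a_{12} \\ -a_{21} & \delta - a_{22} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \tag{19.22}$$

† You will note that the vector $\begin{bmatrix} \beta_1 \delta \\ \beta_2 \delta \end{bmatrix}$ can be rewritten in several equivalent forms:

$$\delta \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} \qquad \text{or} \qquad \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} \delta \qquad \text{or} \qquad \begin{bmatrix} \delta & 0 \\ 0 & \delta \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}$$

We choose the third alternative here because in a subsequent step we shall want to add $\begin{bmatrix} \delta & 0 \\ 0 & \delta \end{bmatrix}$ to
another 2 x 2 matrix. The first two alternative forms will entail problems of dimension conformability.

Assuming the coefficient matrix on the extreme left to be nonsingular, we can readily find
$\beta_1$ and $\beta_2$ (by Cramer's rule) to be

$$\beta_1 = \frac{\delta - a_{22} + a_{12}}{\Delta} \qquad \text{and} \qquad \beta_2 = \frac{\delta - a_{11} + a_{21}}{\Delta} \tag{19.22'}$$

where $\Delta \equiv (\delta - a_{11})(\delta - a_{22}) - a_{12} a_{21}$. Since $\beta_1$ and $\beta_2$ are now expressed entirely in the
known values of the parameters, we only need to insert them into the trial solution (19.21)
to get the definite expressions for the particular solutions.

A more general version of the type of final-demand vector discussed here is given in
Exercise 19.3-1.

The procedure for finding the complementary functions of (19.20) is no different from
that presented in Sec. 19.2. Since the homogeneous version of the equation system is
$x_{t+1} - A x_t = 0$, the characteristic equation should be

$$|bI - A| = \begin{vmatrix} b - a_{11} & -a_{12} \\ -a_{21} & b - a_{22} \end{vmatrix} = 0 \qquad [\text{cf. (19.9')}]$$

From this we can find the characteristic roots $b_1$ and $b_2$ and thence proceed to the
remaining steps of the solution process.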
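To see the formulas in (19.22') at work, the sketch below plugs in made-up parameter values and confirms that the resulting trial solution satisfies $x_{t+1} - A x_t = d_t$ period by period. The input coefficients and the value of $\delta$ are purely hypothetical, chosen only so that $\Delta \neq 0$; the function names are illustrative.

```python
from fractions import Fraction as F

# Hypothetical input coefficients and growth factor (not from the text).
a11, a12, a21, a22 = F(1, 5), F(3, 10), F(2, 5), F(1, 10)
delta = F(11, 10)

# Formulas (19.22')
Delta = (delta - a11) * (delta - a22) - a12 * a21
beta1 = (delta - a22 + a12) / Delta
beta2 = (delta - a11 + a21) / Delta

def x_particular(t):
    """Trial solution (19.21): x_{i,t} = beta_i * delta^t."""
    return (beta1 * delta ** t, beta2 * delta ** t)

def satisfies_model(t):
    """Check x_{t+1} - A x_t = d_t with d_t = (delta^t, delta^t)."""
    x1, x2 = x_particular(t)
    n1, n2 = x_particular(t + 1)
    return (n1 - (a11 * x1 + a12 * x2) == delta ** t and
            n2 - (a21 * x1 + a22 * x2) == delta ** t)

print(beta1, beta2, all(satisfies_model(t) for t in range(5)))
```

Exact fractions are used so that the equality test is not disturbed by rounding.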


Excess Demand and Output Adjustment 

The model formulation in (19.20) can also arise from a different economic assumption.
Consider the situation in which the excess demand for each product always tends to induce
an output increment equal to the excess demand. Since the excess demand for the first
product in period $t$ amounts to

$$\underbrace{a_{11} x_{1,t} + a_{12} x_{2,t} + d_{1,t}}_{\text{demanded}} - \underbrace{x_{1,t}}_{\text{supplied}}$$

the output adjustment (increment) $\Delta x_{1,t}$ is to be set exactly equal to that level:

$$\Delta x_{1,t} \; (\equiv x_{1,t+1} - x_{1,t}) = a_{11} x_{1,t} + a_{12} x_{2,t} + d_{1,t} - x_{1,t}$$

However, if we add $x_{1,t}$ to both sides of this equation, the result will become identical with
(19.19). Similarly, our output-adjustment assumption will give an equation the same as
(19.19') for the second industry. In short, the same mathematical model can result from
altogether different economic assumptions.
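The equivalence of the two assumptions can also be seen by simulation. In the sketch below, one step function implements the production lag (19.19)-(19.19') and the other implements the output-adjustment rule; starting from the same outputs, they should trace out identical paths. The coefficient matrix, final demands, and initial outputs are hypothetical values chosen for illustration.

```python
from fractions import Fraction as F

A = [[F(1, 5), F(3, 10)], [F(2, 5), F(1, 10)]]   # hypothetical input coefficients
d = [F(10), F(20)]                                # hypothetical constant final demand

def demand(x):
    """Total demand for each product: input requirements plus final demand."""
    return [A[i][0] * x[0] + A[i][1] * x[1] + d[i] for i in range(2)]

def step_production_lag(x):
    """Next period's output equals this period's demand."""
    return demand(x)

def step_output_adjustment(x):
    """Output rises by exactly the current excess demand."""
    excess = [q - s for q, s in zip(demand(x), x)]
    return [s + e for s, e in zip(x, excess)]

def path(step, x0, periods):
    """Iterate a step function to produce a list of output vectors."""
    xs = [x0]
    for _ in range(periods):
        xs.append(step(xs[-1]))
    return xs

x0 = [F(5), F(5)]
same = path(step_production_lag, x0, 6) == path(step_output_adjustment, x0, 6)
print(same)
```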

So far, the input-output system has been viewed only in the discrete-time framework.
For comparison purposes, let us now cast the output-adjustment process in the
continuous-time mold.

In the main, this would call for use of the symbol $x_i(t)$ in lieu of $x_{i,t}$, and of the derivative
$x_i'(t)$ in lieu of the difference $\Delta x_{i,t}$. With these changes, our output-adjustment
assumption will manifest itself in the following pair of differential equations:

$$\begin{aligned} x_1'(t) &= a_{11} x_1(t) + a_{12} x_2(t) + d_1(t) - x_1(t) \\ x_2'(t) &= a_{21} x_1(t) + a_{22} x_2(t) + d_2(t) - x_2(t) \end{aligned}$$

At any instant of time $t = t_0$, the symbol $x_i(t_0)$ tells us the rate of output flow per unit of
time (say, per month) that prevails at the said instant, and $d_i(t_0)$ indicates the final demand
per month prevailing at that instant. Hence the right-hand sum in each equation indicates
the rate of excess demand per month, measured at $t = t_0$. The derivative $x_i'(t_0)$ at the left,
on the other hand, represents the rate of output adjustment per month called forth by the
excess demand at $t = t_0$. This adjustment will eradicate the excess demand (and bring about
equilibrium) in a month's time, but only if both the excess demand and the output adjustment
stay unchanged at the current rates. In actuality, the excess demand will vary with
time, as will the induced output adjustment, thus resulting in a cat-and-mouse game of
chase. The solution of the system, consisting of the time paths of the output $x_i$, supplies a
chronicle of this chase. If the solution is convergent, the cat (output adjustment) will
eventually be able to catch the mouse (excess demand), asymptotically (as $t \to \infty$).

After proper rearrangement, this system of differential equations can be written in the
format of (19.13') as follows:

$$Ix' + (I - A)x = d \tag{19.23}$$

where

$$x' = \begin{bmatrix} x_1'(t) \\ x_2'(t) \end{bmatrix} \qquad x = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} \qquad A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \qquad d = \begin{bmatrix} d_1(t) \\ d_2(t) \end{bmatrix}$$

(the prime denoting derivative, not transpose). The complementary functions can be found
by the method discussed earlier. In particular, the characteristic roots are to be found from
the equation

$$|rI + (I - A)| = \begin{vmatrix} r + 1 - a_{11} & -a_{12} \\ -a_{21} & r + 1 - a_{22} \end{vmatrix} = 0 \qquad [\text{cf. (19.16)}]$$

As for the particular integrals, if the final-demand vector contains nonconstant functions
of time $d_1(t)$ and $d_2(t)$ as its elements, a modification will be needed in the method of
solution. Let us illustrate with a simple example.


Example 2


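Given the final-demand vector

$$d = \begin{bmatrix} \lambda_1 e^{\rho t} \\ \lambda_2 e^{\rho t} \end{bmatrix} = \begin{bmatrix} \lambda_1 \\ \lambda_2 \end{bmatrix} e^{\rho t}$$

where $\lambda_i$ and $\rho$ are constants, find the particular integrals of the dynamic model (19.23).
Using the method of undetermined coefficients, we can try solutions of the form
$x_i(t) = \beta_i e^{\rho t}$, which imply, of course, that $x_i'(t) = \rho \beta_i e^{\rho t}$. In matrix notation, these can be
written as

$$x = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} e^{\rho t} \tag{19.24}$$

and

$$x' = \rho \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} e^{\rho t} = \begin{bmatrix} \rho & 0 \\ 0 & \rho \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} e^{\rho t} \qquad [\text{cf. footnote in Example 1}]$$

Upon substituting into (19.23) and canceling the common (nonzero) scalar multiplier $e^{\rho t}$,
we obtain

$$\begin{bmatrix} \rho & 0 \\ 0 & \rho \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} + \begin{bmatrix} 1 - a_{11} & -a_{12} \\ -a_{21} & 1 - a_{22} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} = \begin{bmatrix} \lambda_1 \\ \lambda_2 \end{bmatrix}$$

or

$$\begin{bmatrix} \rho + 1 - a_{11} & -a_{12} \\ -a_{21} & \rho + 1 - a_{22} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} = \begin{bmatrix} \lambda_1 \\ \lambda_2 \end{bmatrix} \tag{19.25}$$

If the leftmost matrix is nonsingular, we can apply Cramer's rule and determine the values
of the coefficients $\beta_i$ to be

$$\beta_1 = \frac{\lambda_1 (\rho + 1 - a_{22}) + \lambda_2 a_{12}}{\Delta} \qquad \beta_2 = \frac{\lambda_2 (\rho + 1 - a_{11}) + \lambda_1 a_{21}}{\Delta} \tag{19.25'}$$

where $\Delta \equiv (\rho + 1 - a_{11})(\rho + 1 - a_{22}) - a_{12} a_{21}$. The undetermined coefficients having
thus been determined, we can introduce these values into the trial solution (19.24) to obtain
the desired particular integrals.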
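The Cramer's-rule formulas in (19.25') can be checked by substituting the resulting $\beta_1$ and $\beta_2$ back into the two equations of (19.25). The sketch below does this with exact fractions; all parameter values are hypothetical and serve only to illustrate the arithmetic.

```python
from fractions import Fraction as F

# Hypothetical parameter values (not from the text).
a11, a12, a21, a22 = F(1, 5), F(3, 10), F(2, 5), F(1, 10)
rho = F(1, 20)
lam1, lam2 = F(3), F(2)

# Formulas (19.25')
Delta = (rho + 1 - a11) * (rho + 1 - a22) - a12 * a21
beta1 = (lam1 * (rho + 1 - a22) + lam2 * a12) / Delta
beta2 = (lam2 * (rho + 1 - a11) + lam1 * a21) / Delta

# Residuals of the two equations in (19.25); both should be exactly zero.
res1 = (rho + 1 - a11) * beta1 - a12 * beta2 - lam1
res2 = -a21 * beta1 + (rho + 1 - a22) * beta2 - lam2

print(beta1, beta2, res1, res2)
```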


Capital Formation 
Another economic consideration that can give rise to a dynamic input-output system is
capital formation, including the accumulation of inventory.

In the static discussion, we only considered the output level of each product needed to
satisfy current demand. The needs for inventory accumulation or capital formation were
either ignored, or subsumed under the final-demand vector. To bring capital formation
into the open, let us now consider, along with an input-coefficient matrix $A = [a_{ij}]$, a
capital-coefficient matrix

$$C \equiv [c_{ij}] = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix}$$

where $c_{ij}$ denotes the dollar worth of the ith commodity needed by the jth industry as new
capital (either equipment or inventory, depending on the nature of the ith commodity) as a
result of an output increment of $1 in the jth industry. For example, if an increase of $1 in
the output of the soft-drink (jth) industry induces it to add $2 worth of bottling equipment
(ith commodity), then $c_{ij} = 2$. Such a capital coefficient thus reveals a marginal
capital-output ratio of sorts, the ratio being limited to one type of capital (the ith commodity) only.
Like the input coefficients $a_{ij}$, the capital coefficients are assumed to be fixed. The idea is
for the economy to produce each commodity in such quantity as to satisfy not only the
input-requirement demand plus the final demand, but also the capital-requirement demand
for it.

If time is continuous, output increment is indicated by the derivatives $x_i'(t)$; thus the
output of each industry should be set at

$$\begin{aligned} x_1(t) &= \underbrace{a_{11} x_1(t) + a_{12} x_2(t)}_{\text{input requirement}} + \underbrace{c_{11} x_1'(t) + c_{12} x_2'(t)}_{\text{capital requirement}} + \underbrace{d_1(t)}_{\text{final demand}} \\ x_2(t) &= a_{21} x_1(t) + a_{22} x_2(t) + c_{21} x_1'(t) + c_{22} x_2'(t) + d_2(t) \end{aligned}$$

In matrix notation, this is expressible by the equation

$$Ix = Ax + Cx' + d$$

or

$$Cx' + (A - I)x = -d \tag{19.26}$$


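If time is discrete, the capital requirement in period $t$ will be based on the output
increment $x_{i,t} - x_{i,t-1}$ ($\equiv \Delta x_{i,t-1}$); thus the output levels should be set at

$$\begin{bmatrix} x_{1,t} \\ x_{2,t} \end{bmatrix} = \underbrace{\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x_{1,t} \\ x_{2,t} \end{bmatrix}}_{\text{input requirement}} + \underbrace{\begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix} \begin{bmatrix} x_{1,t} - x_{1,t-1} \\ x_{2,t} - x_{2,t-1} \end{bmatrix}}_{\text{capital requirement}} + \underbrace{\begin{bmatrix} d_{1,t} \\ d_{2,t} \end{bmatrix}}_{\text{final demand}}$$

or

$$Ix_t = Ax_t + C(x_t - x_{t-1}) + d_t$$

By shifting the time subscripts forward one period, and collecting terms, however, we can
write the equation in the form

$$(I - A - C)x_{t+1} + Cx_t = d_{t+1} \tag{19.27}$$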


The differential-equation system (19.26) and the difference-equation system (19.27) can
again be solved, of course, by the method of Sec. 19.2. It also goes without saying that these
two matrix equations are both extendible to the n-industry case simply by an appropriate
redefinition of the matrices and a corresponding change in the dimensions thereof.
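As an illustration of how (19.27) can be stepped forward, the sketch below solves $(I - A - C)x_{t+1} = d_{t+1} - Cx_t$ for $x_{t+1}$ and then confirms that the resulting pair of output vectors satisfies the original balance condition $x_t = Ax_t + C(x_t - x_{t-1}) + d_t$. The matrices, the constant final demand, and the starting outputs are hypothetical, and $(I - A - C)$ is assumed to be nonsingular.

```python
from fractions import Fraction as F

A = [[F(1, 5), F(3, 10)], [F(2, 5), F(1, 10)]]   # hypothetical input coefficients
C = [[F(1, 10), F(0)], [F(0), F(1, 5)]]          # hypothetical capital coefficients
d = [F(10), F(20)]                                # hypothetical constant final demand

def solve_2x2(a, rhs):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("coefficient matrix is singular")
    return [(rhs[0] * a[1][1] - a[0][1] * rhs[1]) / det,
            (a[0][0] * rhs[1] - rhs[0] * a[1][0]) / det]

def next_output(x_t):
    """Solve (I - A - C) x_{t+1} = d - C x_t for x_{t+1}, following (19.27)."""
    lhs = [[(1 if i == j else 0) - A[i][j] - C[i][j] for j in range(2)]
           for i in range(2)]
    rhs = [d[i] - (C[i][0] * x_t[0] + C[i][1] * x_t[1]) for i in range(2)]
    return solve_2x2(lhs, rhs)

def balance_gap(x_prev, x_now):
    """x_t - [A x_t + C (x_t - x_{t-1}) + d]; zero when the balance holds."""
    gap = []
    for i in range(2):
        inputs = A[i][0] * x_now[0] + A[i][1] * x_now[1]
        capital = (C[i][0] * (x_now[0] - x_prev[0]) +
                   C[i][1] * (x_now[1] - x_prev[1]))
        gap.append(x_now[i] - (inputs + capital + d[i]))
    return gap

x0 = [F(30), F(40)]
x1 = next_output(x0)
print(x1, balance_gap(x0, x1))
```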

In the preceding, we have discussed how a dynamic input-output model can arise from
such considerations as time lags and adjustment mechanisms. When similar considerations
are applied to general-equilibrium market models, the latter will tend to become dynamic
in much the same way. But, since the formulation of such models is analogous in spirit to
input-output models, we shall dispense with a formal discussion thereof and merely refer
you to the illustrative cases in Exercises 19.3-6 and 19.3-7.





EXERCISE 19.3 
1. In Example 1, if the final-demand vector is changed to $d_t = \begin{bmatrix} \lambda_1 \delta^t \\ \lambda_2 \delta^t \end{bmatrix}$, what will the
particular solutions be? After finding your answers, show that the answers in Example 1
are merely a special case of these, with $\lambda_1 = \lambda_2 = 1$.

2. (a) Show that (19.22) can be written more concisely as
       $(\delta I - A)\beta = u$
   (b) Of the five symbols used, which are scalars? Vectors? Matrices?
   (c) Write the solution for $\beta$ in matrix form, assuming $(\delta I - A)$ to be nonsingular.

3. (a) Show that (19.25) can be written more concisely as
       $(\rho I + I - A)\beta = \lambda$
   (b) Which of the five symbols represent scalars, vectors, and matrices, respectively?
   (c) Write the solution for $\beta$ in matrix form, assuming $(\rho I + I - A)$ to be nonsingular.

4. Given $A = \begin{bmatrix} \frac{3}{10} & \frac{4}{10} \\ \frac{3}{10} & \frac{2}{10} \end{bmatrix}$ and $d_t = \begin{bmatrix} (\frac{12}{10})^t \\ (\frac{12}{10})^t \end{bmatrix}$ for the discrete-time production-lag
input-output model described in (19.20), find (a) the particular solutions; (b) the
complementary functions; and (c) the definite time paths, assuming initial outputs $x_{1,0} = \frac{187}{39}$
and $x_{2,0} = \frac{72}{13}$. (Use fractions, not decimals, in all calculations.)

5. Given $A = \begin{bmatrix} \frac{3}{10} & \frac{4}{10} \\ \frac{3}{10} & \frac{2}{10} \end{bmatrix}$ and $d = \begin{bmatrix} e^{t/10} \\ 2e^{t/10} \end{bmatrix}$ for the continuous-time output-adjustment
input-output model described in (19.23), find (a) the particular integrals; (b) the
complementary functions; and (c) the definite time paths, assuming initial conditions
$x_1(0) = \frac{53}{6}$ and $x_2(0) = \frac{25}{6}$. (Use fractions, not decimals, in all calculations.)

6. In an n-commodity market, all $Q_{di}$ and $Q_{si}$ (with $i = 1, 2, \ldots, n$) can be considered
as functions of the $n$ prices $P_1, \ldots, P_n$, and so can the excess demand for each
commodity $E_i \equiv Q_{di} - Q_{si}$. Assuming linearity, we can write

$$\begin{aligned} E_1 &= a_{10} + a_{11} P_1 + a_{12} P_2 + \cdots + a_{1n} P_n \\ E_2 &= a_{20} + a_{21} P_1 + a_{22} P_2 + \cdots + a_{2n} P_n \\ &\;\;\vdots \\ E_n &= a_{n0} + a_{n1} P_1 + a_{n2} P_2 + \cdots + a_{nn} P_n \end{aligned}$$

or, in matrix notation,

$$E = a + AP$$

   (a) What do these last four symbols stand for: scalars, vectors, or matrices? What are
       their respective dimensions?
   (b) Consider all prices to be functions of time, and assume that $dP_i/dt = \alpha_i E_i$ ($i = 1,
       2, \ldots, n$). What is the economic interpretation of this last set of equations?
   (c) Write out the differential equations showing each $dP_i/dt$ to be a linear function of
       the $n$ prices.
   (d) Show that, if we let $P'$ denote the n x 1 column vector of the derivatives $dP_i/dt$,
       and if we let $\alpha$ denote an n x n diagonal matrix, with $\alpha_1, \alpha_2, \ldots, \alpha_n$ (in that order)
       in the principal diagonal and zeros elsewhere, we can write the preceding
       differential-equation system in matrix notation as $P' - \alpha AP = \alpha a$.

7. For the n-commodity market of Prob. 6, the discrete-time version would consist of a set
of difference equations $\Delta P_{i,t} = \alpha_i E_{i,t}$ ($i = 1, 2, \ldots, n$), where $E_{i,t} = a_{i0} + a_{i1} P_{1,t} +
a_{i2} P_{2,t} + \cdots + a_{in} P_{n,t}$.
   (a) Write out the excess-demand equation system, and show that it can be expressed
       in matrix notation as $E_t = a + AP_t$.
   (b) Show that the price adjustment equations can be written as $P_{t+1} - P_t = \alpha E_t$,
       where $\alpha$ is the n x n diagonal matrix defined in Prob. 6.
   (c) Show that the difference-equation system of the present discrete-time model can
       be expressed in the form $P_{t+1} - (I + \alpha A)P_t = \alpha a$.


19.4 The Inflation-Unemployment 
Model Once More 


Having illustrated the multisector type of dynamic systems with input-output models, we
shall now provide an economic example of simultaneous dynamic equations in the one-sector
setting. For this purpose, the inflation-unemployment model, already encountered
twice before in two different guises, can be called back into service once again.







Simultaneous Differential Equations 


In Sec. 16.5 the inflation-unemployment model was presented in the continuous-time
framework via the following three equations:

$$p = \alpha - T - \beta U + g\pi \qquad (\alpha, \beta > 0;\; 0 < g \le 1) \tag{16.33}$$

$$\frac{d\pi}{dt} = j(p - \pi) \qquad (0 < j \le 1) \tag{16.34}$$

$$\frac{dU}{dt} = -k(\mu - p) \qquad (k > 0) \tag{16.35}$$

except that we have adopted the Greek letter $\mu$ here to replace $m$ in (16.35) in order to avoid
confusion with our earlier usage of the symbol $m$ in the methodological discussion of Sec. 19.2.
In the treatment of this model in Sec. 16.5, since we were not yet equipped then to deal with
simultaneous dynamic equations, we approached the problem by condensing the model into a
single equation in one variable. That necessitated a quite laborious process of substitutions and
eliminations. Now, in view of the coexistence of two given patterns of change in the model for
$\pi$ and $U$, we shall treat the model as one of two simultaneous differential equations.

When (16.33) is substituted into the other two equations, and the derivatives $d\pi/dt \equiv
\pi'(t)$ and $dU/dt \equiv U'(t)$ written more simply as $\pi'$ and $U'$, the model assumes the form

$$\begin{aligned} \pi' + j(1 - g)\pi + j\beta U &= j(\alpha - T) \\ U' - kg\pi + k\beta U &= k(\alpha - T - \mu) \end{aligned} \tag{19.28}$$

or, in matrix notation,

$$\underbrace{\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}}_{J} \begin{bmatrix} \pi' \\ U' \end{bmatrix} + \underbrace{\begin{bmatrix} j(1 - g) & j\beta \\ -kg & k\beta \end{bmatrix}}_{M} \begin{bmatrix} \pi \\ U \end{bmatrix} = \begin{bmatrix} j(\alpha - T) \\ k(\alpha - T - \mu) \end{bmatrix} \tag{19.28'}$$

From this system, the time paths of $\pi$ and $U$ can be found simultaneously. Then, if desired,
we can derive the $p$ path by using (16.33).


Solution Paths 

To find the particular integrals, we can simply set $\pi' = U' = 0$ (to make $\pi$ and $U$ stationary
over time) in (19.28') and solve for $\pi$ and $U$. In our earlier discussion, in (19.14), such
solutions were obtained through matrix inversion, but Cramer's rule can certainly be used,
too. Either way, we can find that

$$\bar{\pi} = \mu \qquad \text{and} \qquad \bar{U} = \frac{1}{\beta} [\alpha - T - (1 - g)\mu] \tag{19.29}$$

The result that $\bar{\pi} = \mu$ (the equilibrium expected rate of inflation equals the rate of monetary
expansion) coincides with that reached in Sec. 16.5. As to the rate of unemployment
$U$, we made no attempt to find its equilibrium level in that section. If we did (on the basis
of the differential equation in $U$ given in Exercise 16.5-2), however, the answer would be no
different from the $\bar{U}$ solution in (19.29).
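The formulas in (19.29) can be checked by inserting them into (19.28) with the derivatives set to zero. In the sketch below, the values of $\alpha - T$, $\beta$, $g$, $j$, and $k$ are those of the numerical example given later in this section, while the value of $\mu$ is a hypothetical choice made only to have a concrete number to work with.

```python
from fractions import Fraction as F

alpha_minus_T = F(1, 6)
beta, g, j, k = F(3), F(1), F(3, 4), F(1, 2)
mu = F(1, 20)   # hypothetical rate of monetary expansion

# Particular integrals (19.29)
pi_bar = mu
U_bar = (alpha_minus_T - (1 - g) * mu) / beta

# Residuals of the two equations in (19.28) with pi' = U' = 0.
row1 = j * (1 - g) * pi_bar + j * beta * U_bar - j * alpha_minus_T
row2 = -k * g * pi_bar + k * beta * U_bar - k * (alpha_minus_T - mu)

print(pi_bar, U_bar, row1, row2)
```

With $g = 1$, the equilibrium unemployment rate does not depend on $\mu$ at all, which is why the choice of $\mu$ here affects only $\bar{\pi}$.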

Turning to the complementary functions, which are based on the trial solutions $me^{rt}$ and
$ne^{rt}$, we can determine $m$, $n$, and $r$ from the reduced matrix equation

$$(rJ + M) \begin{bmatrix} m \\ n \end{bmatrix} = 0 \qquad [\text{from (19.15)}]$$

which, in the present context, takes the form

$$\begin{bmatrix} r + j(1 - g) & j\beta \\ -kg & r + k\beta \end{bmatrix} \begin{bmatrix} m \\ n \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \tag{19.30}$$

To avoid trivial solutions for $m$ and $n$ from this homogeneous system, the determinant of
the coefficient matrix must be made to vanish; that is, we require

$$|rJ + M| = r^2 + [k\beta + j(1 - g)]r + k\beta j = 0 \tag{19.31}$$

This quadratic equation, a specific version of the characteristic equation $r^2 + a_1 r +
a_2 = 0$, has coefficients

$$a_1 = k\beta + j(1 - g) \qquad \text{and} \qquad a_2 = k\beta j$$

And these, as we would expect, are precisely the $a_1$ and $a_2$ values in (16.37''), a
single-equation version of the present model in the variable $\pi$. As a result, the previous analysis
of the three cases of characteristic roots should apply here with equal validity. Among other
conclusions, we may recall that, regardless of whether the roots happen to be real or complex,
the real part of each root in the present model turns out to be always negative. Thus
the solution paths are always convergent.


Example 1


Find the time paths of x and U, given the parameter values 
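
$$\alpha - T = \frac{1}{6} \qquad \beta = 3 \qquad g = 1 \qquad j = \frac{3}{4} \qquad \text{and} \qquad k = \frac{1}{2}$$

Since these parameter values duplicate those in Example 1 in Sec. 16.5, the results of the
present analysis can be readily checked against those of the said section.

First, it is easy to determine that the particular integrals are

$$\bar{\pi} = \mu \qquad \text{and} \qquad \bar{U} = \frac{1}{3}\left(\frac{1}{6}\right) = \frac{1}{18} \qquad [\text{by (19.29)}] \tag{19.32}$$

The characteristic equation being

$$r^2 + \frac{3}{2}r + \frac{9}{8} = 0 \qquad [\text{by (19.31)}]$$

the two roots turn out to be complex:

$$r_1, r_2 = \frac{1}{2}\left(-\frac{3}{2} \pm \sqrt{\frac{9}{4} - \frac{9}{2}}\,\right) = -\frac{3}{4} \pm \frac{3}{4}i \qquad \left(\text{with } h = -\frac{3}{4} \text{ and } v = \frac{3}{4}\right) \tag{19.33}$$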
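
Substitution of the two roots (along with the parameter values) into (19.30) yields,
respectively, the matrix equations

$$\begin{bmatrix} -\frac{3}{4}(1-i) & \frac{9}{4} \\ -\frac{1}{2} & \frac{3}{4}(1+i) \end{bmatrix}\begin{bmatrix} m_1 \\ n_1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \qquad \left[\text{from } r_1 = -\frac{3}{4} + \frac{3}{4}i\right] \tag{19.34}$$

$$\begin{bmatrix} -\frac{3}{4}(1+i) & \frac{9}{4} \\ -\frac{1}{2} & \frac{3}{4}(1-i) \end{bmatrix}\begin{bmatrix} m_2 \\ n_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \qquad \left[\text{from } r_2 = -\frac{3}{4} - \frac{3}{4}i\right] \tag{19.34'}$$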


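Since $r_1$ and $r_2$ are designed, via (19.31), to make the coefficient matrix singular, each
of the preceding two matrix equations actually contains only one independent equation, which
can determine only a proportionality relation between the arbitrary constants $m_i$ and
$n_i$. Specifically, we have

$$\frac{1}{3}(1-i)\,m_1 = n_1 \qquad \text{and} \qquad \frac{1}{3}(1+i)\,m_2 = n_2$$

The complementary functions can, accordingly, be expressed as

$$\begin{aligned}
\begin{bmatrix} \pi_c \\ U_c \end{bmatrix}
&= \begin{bmatrix} m_1 e^{r_1 t} + m_2 e^{r_2 t} \\ n_1 e^{r_1 t} + n_2 e^{r_2 t} \end{bmatrix} \\
&= e^{ht}\begin{bmatrix} m_1 e^{vit} + m_2 e^{-vit} \\ n_1 e^{vit} + n_2 e^{-vit} \end{bmatrix} && [\text{by (16.11)}] \\
&= e^{ht}\begin{bmatrix} (m_1 + m_2)\cos vt + (m_1 - m_2)i \sin vt \\ (n_1 + n_2)\cos vt + (n_1 - n_2)i \sin vt \end{bmatrix} && [\text{by (16.24)}]
\end{aligned}$$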


If, for notational simplicity, we define new arbitrary constants

$$A_5 \equiv m_1 + m_2 \qquad \text{and} \qquad A_6 \equiv (m_1 - m_2)i$$

it then follows that†

$$n_1 + n_2 = \frac{1}{3}(A_5 - A_6) \qquad (n_1 - n_2)i = \frac{1}{3}(A_5 + A_6)$$

So, using these, and incorporating the $h$ and $v$ values of (19.33) into the complementary
functions, we end up with

$$\begin{bmatrix} \pi_c \\ U_c \end{bmatrix} = e^{-3t/4}\begin{bmatrix} A_5 \cos \frac{3}{4}t + A_6 \sin \frac{3}{4}t \\[4pt] \frac{1}{3}(A_5 - A_6)\cos \frac{3}{4}t + \frac{1}{3}(A_5 + A_6)\sin \frac{3}{4}t \end{bmatrix} \tag{19.35}$$

Finally, by combining the particular integrals in (19.32) with the above complementary
functions, we can obtain the solution paths of $\pi$ and $U$. As may be expected, these paths
are exactly the same as those in (16.43) and (16.45) in Sec. 16.5.
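As a numerical cross-check of Example 1, the sketch below computes the characteristic roots of (19.31) and the ratio $n/m$ implied by each row of (19.30). It uses the parameter values of the example; the function name `n_over_m` is introduced here only for illustration. If the coefficient matrix is indeed singular at each root, both rows should give the same ratio.

```python
import cmath

# Parameter values of Example 1 (alpha - T = 1/6 does not enter the reduced equation).
beta, g, j, k = 3.0, 1.0, 0.75, 0.5

a1 = k * beta + j * (1 - g)      # coefficient of r in (19.31)
a2 = k * beta * j                # constant term in (19.31)

disc = cmath.sqrt(a1 ** 2 - 4 * a2)
roots = [(-a1 + disc) / 2, (-a1 - disc) / 2]


def n_over_m(r):
    """Ratio n/m implied by each row of (19.30) for a characteristic root r."""
    row1 = -(r + j * (1 - g)) / (j * beta)   # from [r + j(1-g)] m + j*beta*n = 0
    row2 = (k * g) / (r + k * beta)          # from -k*g*m + (r + k*beta) n = 0
    return row1, row2


for r in roots:
    row1, row2 = n_over_m(r)
    print(f"r = {r:.4f}: n/m from row 1 = {row1:.4f}, from row 2 = {row2:.4f}")
```

The two rows agree for each root, which is the proportionality relation used to tie $n_1$, $n_2$ to $m_1$, $m_2$.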


Simultaneous Difference Equations 

The simultaneous-equation treatment of the inflation-unemployment model in discrete
time is similar in spirit to the preceding continuous-time discussion. We shall thus merely
give the highlights.


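† This can be seen from the following:

$$\begin{aligned}
n_1 + n_2 &= \tfrac{1}{3}(1-i)\,m_1 + \tfrac{1}{3}(1+i)\,m_2 = \tfrac{1}{3}\,[(m_1 + m_2) - (m_1 - m_2)i] = \tfrac{1}{3}(A_5 - A_6) \\
(n_1 - n_2)i &= \tfrac{1}{3}\,[(1-i)\,m_1 - (1+i)\,m_2]\,i = \tfrac{1}{3}\,[(m_1 - m_2) - (m_1 + m_2)i]\,i = \tfrac{1}{3}(A_6 + A_5) \qquad [i^2 = -1]
\end{aligned}$$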


The model in question, as given in Sec. 18.3, consists of three equations, two of which
describe the patterns of change of $\pi$ and $U$, respectively:

$$p_t = \alpha - T - \beta U_t + g\pi_t \tag{18.18}$$
$$\pi_{t+1} - \pi_t = j(p_t - \pi_t) \tag{18.19}$$
$$U_{t+1} - U_t = -k(\mu - p_{t+1}) \tag{18.20}$$

Eliminating $p$ and collecting terms, we can rewrite the model as the difference-equation
system

$$\begin{bmatrix} 1 & 0 \\ -kg & 1 + \beta k \end{bmatrix}\begin{bmatrix} \pi_{t+1} \\ U_{t+1} \end{bmatrix} + \begin{bmatrix} -(1 - j + jg) & j\beta \\ 0 & -1 \end{bmatrix}\begin{bmatrix} \pi_t \\ U_t \end{bmatrix} = \begin{bmatrix} j(\alpha - T) \\ k(\alpha - T - \mu) \end{bmatrix} \tag{19.36}$$


Solution Paths 
If stationary equilibriums exist, the particular solutions of (19.36) can be expressed as
$\bar{\pi} = \pi_t = \pi_{t+1}$ and $\bar{U} = U_t = U_{t+1}$. Substituting $\bar{\pi}$ and
$\bar{U}$ into (19.36), and solving the system (by matrix inversion or Cramer's rule), we obtain

$$\bar{\pi} = \mu \qquad \text{and} \qquad \bar{U} = \frac{1}{\beta}\,[\alpha - T - (1-g)\mu] \tag{19.37}$$

The $\bar{U}$ value is the same as what was found in Sec. 18.3. Although we did not find
$\bar{\pi}$ in the latter section, the information in Exercise 18.3-2 indicates that
$\bar{\pi} = \mu$, which agrees with (19.37). In fact, you may note, the results in (19.37) are
also identical with the intertemporal equilibrium values obtained in the continuous-time
framework in (19.29).


The search for the complementary functions, based this time on the trial solutions $mb^t$
and $nb^t$, involves the reduced matrix equation

$$(bJ + K)\begin{bmatrix} m \\ n \end{bmatrix} = 0$$

or, in view of (19.36),

$$\begin{bmatrix} b - (1 - j + jg) & j\beta \\ -bkg & b(1 + \beta k) - 1 \end{bmatrix}\begin{bmatrix} m \\ n \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \tag{19.38}$$

In order to avoid trivial solutions from this homogeneous system, we require

$$|bJ + K| = (1 + \beta k)\,b^2 - [1 + gj + (1 - j)(1 + \beta k)]\,b + (1 - j + jg) = 0 \tag{19.39}$$

The normalized version of this quadratic equation is the characteristic equation
$b^2 + a_1 b + a_2 = 0$, with the same $a_1$ and $a_2$ coefficients as in (18.24) and (18.33)
in Sec. 18.3. Consequently, the analysis of the three cases of characteristic roots undertaken
in that section should equally apply here.

For each root, $b_i$, (19.38) supplies us with a specific proportionality relation between
the arbitrary constants $m_i$ and $n_i$, and these enable us to link the arbitrary constants
in the complementary function for $U$ to those in the complementary function for $\pi$. Then,
by combining the complementary functions and the particular solutions, we can get the time
paths of $\pi$ and $U$.
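The discrete-time results can be illustrated in the same spirit. The sketch below computes the roots of (19.39) and iterates (18.18) to (18.20) forward, checking that the paths settle at the values in (19.37). The parameter values, the starting point and the function names are hypothetical choices for this example; with them the roots are complex with modulus below one, so the paths converge.

```python
import cmath


def characteristic_roots(beta, g, j, k):
    """Roots of (19.39), written as A b^2 + B b + C = 0."""
    A = 1 + beta * k
    B = -(1 + g * j + (1 - j) * (1 + beta * k))
    C = 1 - j + j * g
    disc = cmath.sqrt(B * B - 4 * A * C)
    return (-B + disc) / (2 * A), (-B - disc) / (2 * A)


def iterate(pi0, U0, alpha_minus_T, beta, g, j, k, mu, periods):
    """Iterate the model forward from (pi0, U0) for the given number of periods."""
    pi, U = pi0, U0
    for _ in range(periods):
        p = alpha_minus_T - beta * U + g * pi            # (18.18)
        pi_next = pi + j * (p - pi)                      # (18.19)
        # (18.20) contains p_{t+1}, which itself depends on U_{t+1};
        # solving for U_{t+1} gives the second row of (19.36).
        U_next = (U + k * (alpha_minus_T + g * pi_next - mu)) / (1 + beta * k)
        pi, U = pi_next, U_next
    return pi, U


# Hypothetical parameter values and starting point.
par = dict(alpha_minus_T=1 / 6, beta=3.0, g=0.5, j=0.75, k=0.5, mu=0.05)
b1, b2 = characteristic_roots(par["beta"], par["g"], par["j"], par["k"])
pi_T, U_T = iterate(0.10, 0.02, periods=200, **par)
U_bar = (par["alpha_minus_T"] - (1 - par["g"]) * par["mu"]) / par["beta"]
print(abs(b1), abs(b2), pi_T, U_T, U_bar)
```

The implicit treatment of $U_{t+1}$ in the loop mirrors the way (19.36) places $1 + \beta k$ on the left side of the system.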





EXERCISE 19.4 


1. Verify (19.29) by using Cramer's rule.


2. Verify that the same proportionality relation between $m_1$ and $n_1$ emerges whether we
use the first or the second equation in the system (19.34).


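3. Find the time paths (general solutions) of $\pi$ and $U$, given:

$$\begin{aligned}
p &= \tfrac{1}{6} - 2U + \tfrac{1}{3}\pi \\
\pi' &= \tfrac{1}{4}(p - \pi) \\
U' &= -\tfrac{1}{2}(\mu - p)
\end{aligned}$$

4. Find the time paths (general solutions) of $\pi$ and $U$, given:

$$\begin{aligned}
\text{(a)}\quad p_t &= \tfrac{1}{2} - 3U_t + \tfrac{1}{2}\pi_t & \qquad \text{(b)}\quad p_t &= \tfrac{1}{4} - 4U_t + \pi_t \\
\pi_{t+1} - \pi_t &= \tfrac{1}{4}(p_t - \pi_t) & \pi_{t+1} - \pi_t &= \tfrac{1}{4}(p_t - \pi_t) \\
U_{t+1} - U_t &= -(\mu - p_{t+1}) & U_{t+1} - U_t &= -(\mu - p_{t+1})
\end{aligned}$$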


19.5 Two-Variable Phase Diagrams





The preceding sections have dealt with the quantitative solutions of linear dynamic systems.
In the present section, we shall discuss the qualitative-graphic (phase-diagram) analysis of
a nonlinear differential-equation system. More specifically, our attention will be focused on
the first-order differential-equation system in two variables, in the general form of

$$\begin{aligned}
x'(t) &= f(x, y) \\
y'(t) &= g(x, y)
\end{aligned}$$

Note that the time derivatives $x'(t)$ and $y'(t)$ depend only on $x$ and $y$ and that the
variable $t$ does not enter into the $f$ and $g$ functions as a separate argument. This
feature, which makes the system an autonomous system, is a prerequisite for the application of
the phase-diagram technique.*

The two-variable phase diagram, like the one-variable version in Sec. 15.6, is limited in
that it can answer only qualitative questions: those concerning the location and the
dynamic stability of the intertemporal equilibrium(s). But, again like the one-variable
version, it has the compensating advantages of being able to handle nonlinear systems as
comfortably as linear ones and to address problems couched in terms of general functions
as readily as those in terms of specific ones.


* In the one-variable phase diagram introduced earlier in Sec. 15.6, the equation
$dy/dt = f(y)$ is also restricted to be autonomous, being forbidden to have the variable $t$
as an explicit argument in the function $f$.




The Phase Space 

When constructing the one-variable phase diagram (Fig. 15.3) for the (autonomous)
differential equation $dy/dt = f(y)$, we simply plotted $dy/dt$ against $y$ on the two axes in
a two-dimensional phase space. Now that the number of variables is doubled, however, how can
we manage to meet the apparent need for more axes? The answer, fortunately, is that the
2-space is all we need.

To see why this is feasible, observe that the most crucial task of phase-diagram
construction is to determine the direction of movement of the variable(s) over time. It is
this information, as embodied in the arrowheads in Fig. 15.3, that enables us to derive the
final qualitative inferences. For the drawing of the said arrowheads, only two things are
required: (1) a demarcation line (call it the "$dy/dt = 0$" line) that provides the locale for
any prospective equilibrium(s) and, more importantly, separates the phase space into two
regions, one characterized by $dy/dt > 0$ and the other by $dy/dt < 0$, and (2) a real line on
which the increases and decreases of $y$ that are implied by any nonzero values of $dy/dt$ can
be indicated. In Fig. 15.3, the demarcation line cited in item 1 is found in the horizontal
axis. But that axis actually also serves as the real line cited in item 2. This means that the
vertical axis, for $dy/dt$, can actually be given up without loss, provided we take care to
distinguish between the $dy/dt > 0$ region and the $dy/dt < 0$ region, say, by labeling the
former with a plus sign, and the latter with a minus sign. This dispensability of one axis is
what makes feasible the placement of a two-variable phase diagram in the 2-space. We now
need two real lines instead of one. But this is automatically taken care of by the standard
$x$ and $y$ axes of a two-dimensional diagram. We now also need two demarcation lines
(or curves), one for $dx/dt = 0$ and the other for $dy/dt = 0$. But these are both graphable
in a two-dimensional phase space. And once these are drawn, it would not be difficult to
decide which sides of these lines or curves should be marked with plus and minus signs,
respectively.





The Demarcation Curves 

Given the following autonomous differential-equation system

$$\begin{aligned}
x' &= f(x, y) \\
y' &= g(x, y)
\end{aligned} \tag{19.40}$$

where $x'$ and $y'$ are short for the time derivatives $x'(t)$ and $y'(t)$, respectively, the
two demarcation curves, to be denoted by $x' = 0$ and $y' = 0$, represent the graphs of the
two equations

$$f(x, y) = 0 \qquad [x' = 0 \text{ curve}] \tag{19.41}$$
$$g(x, y) = 0 \qquad [y' = 0 \text{ curve}] \tag{19.42}$$

If the specific form of the $f$ function is known, (19.41) can be solved for $y$ in terms of
$x$ and the solution plotted in the $xy$ plane as the $x' = 0$ curve. Even if not, however, we
can nonetheless resort to the implicit-function rule and ascertain the slope of the $x' = 0$
curve to be

$$\left.\frac{dy}{dx}\right|_{x'=0} = -\frac{\partial f/\partial x}{\partial f/\partial y} = -\frac{f_x}{f_y} \qquad (f_y \neq 0) \tag{19.43}$$





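FIGURE 19.1 [Phase diagram in the $xy$ plane: the $x' = 0$ and $y' = 0$ demarcation curves, with $x$ on the horizontal axis and $y$ on the vertical axis]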


As long as the signs of the partial derivatives $f_x$ and $f_y$ ($\neq 0$) are known, a
qualitative clue to the slope of the $x' = 0$ curve is available from (19.43). By the same
token, the slope of the $y' = 0$ curve can be inferred from the derivative

$$\left.\frac{dy}{dx}\right|_{y'=0} = -\frac{g_x}{g_y} \qquad (g_y \neq 0) \tag{19.44}$$
For a more concrete illustration, let us assume that

$$f_x < 0 \qquad f_y > 0 \qquad g_x > 0 \qquad \text{and} \qquad g_y < 0 \tag{19.45}$$

Then both the $x' = 0$ and $y' = 0$ curves will be positively sloped. If we further assume that

$$-\frac{f_x}{f_y} > -\frac{g_x}{g_y} \qquad [x' = 0 \text{ curve steeper than } y' = 0 \text{ curve}]$$

then we may encounter a situation such as that shown in Fig. 19.1. Note that the demarcation
lines are now possibly curved. Note, also, that they are now no longer required to
coincide with the axes.

The two demarcation curves, intersecting at point $E$, divide the phase space into four
distinct regions, labeled I through IV. Point $E$, where $x$ and $y$ are both stationary
($x' = y' = 0$), represents the intertemporal equilibrium of the system. At any other point,
however, either $x$ or $y$ (or both) would be changing over time, in directions dictated by the
signs of the time derivatives $x'$ and $y'$ at that point. In the present instance, we happen
to have $x' > 0$ ($x' < 0$) to the left (right) of the $x' = 0$ curve; hence the plus (minus)
signs on the left (right) of that curve. These signs are based on the fact that

$$\frac{\partial x'}{\partial x} = f_x < 0 \qquad [\text{by (19.40) and (19.45)}] \tag{19.46}$$

which implies that, as we move continually from west to east in the phase space (as $x$
increases), $x'$ undergoes a steady decrease, so that the sign of $x'$ must pass through three
stages, in the order $+$, $0$, $-$. Analogously, the derivative

$$\frac{\partial y'}{\partial y} = g_y < 0 \qquad [\text{by (19.40) and (19.45)}] \tag{19.47}$$

implies that, as we move continually from south to north (as $y$ increases), $y'$ steadily
decreases, so that the sign of $y'$ must pass through three stages, in the order $+$, $0$, $-$.
Thus we are led to append the plus signs below, and the minus signs above, the $y' = 0$ curve
in Fig. 19.1.

FIGURE 19.2 [Phase diagram in the $xy$ plane: the $x' = 0$ and $y' = 0$ curves of Fig. 19.1 with representative streamlines]

On the basis of these plus and minus signs, a set of directional arrows can now be drawn
to indicate the intertemporal movement of $x$ and $y$. For any point in region I, $x'$ and $y'$
are both negative. Hence $x$ and $y$ must both decrease over time, producing a westward
movement for $x$, and a southward movement for $y$. As indicated by the two arrows in region I,
given an initial point located in region I, the intertemporal movement must be in the
general southwestward direction. The exact opposite is true in region III, where $x'$ and $y'$
are both positive, so that both the $x$ and $y$ variables must increase over time. In contrast,
$x'$ and $y'$ have different signs in region II. With $x'$ positive and $y'$ negative, $x$
should move eastward and $y$ southward. And region IV displays a tendency exactly opposite to
region II.
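The sign-based reasoning above lends itself to a small computational illustration. The sketch below uses a hypothetical linear system, $x' = -2x + y + 2$ and $y' = x - 2y + 2$, which satisfies (19.45), has its $x' = 0$ curve (slope 2) steeper than its $y' = 0$ curve (slope 1/2), and has its equilibrium at $(2, 2)$. The system, the sample points and the helper names are assumptions made for this example; the region labels follow the sign patterns described in the text.

```python
def f(x, y):
    return -2 * x + y + 2      # f_x = -2 < 0, f_y = 1 > 0


def g(x, y):
    return x - 2 * y + 2       # g_x = 1 > 0, g_y = -2 < 0


def sign(v):
    return (v > 0) - (v < 0)


# Sign pattern (sign of x', sign of y') for each region, as described in the text.
REGION = {(-1, -1): "I", (1, -1): "II", (1, 1): "III", (-1, 1): "IV"}


def classify(x, y):
    """Return the region label and compass heading at the point (x, y)."""
    sx, sy = sign(f(x, y)), sign(g(x, y))
    if sx == 0 or sy == 0:
        return "on a demarcation curve", None
    heading = ("north" if sy > 0 else "south") + ("east" if sx > 0 else "west")
    return REGION[(sx, sy)], heading


for point in [(3, 3.5), (1, 3), (1, 0.5), (3, 1), (2, 2)]:
    print(point, classify(*point))
```

Only the signs of $x'$ and $y'$ are used, which is why the same procedure would work for functions known only in general form, provided their signs can be ascertained at the points of interest.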





Streamlines 

For a better grasp of the implications of the directional arrows, we can sketch a series of
streamlines in the phase diagram. Also referred to as phase trajectories (or trajectories for
short) or phase paths, these streamlines serve to map out the dynamic movement of the
system from any conceivable initial point. A few of these are illustrated in Fig. 19.2, which
reproduces the $x' = 0$ and $y' = 0$ curves in Fig. 19.1. Since every point in the phase space
must be located on one streamline or another, there should exist an infinite number of
streamlines, all of which conform to the directional requirements imposed by the $xy$ arrows
in every region. For depicting the general qualitative character of the phase diagram,
however, a few representative streamlines should normally suffice.

Several features may be noted about the streamlines in Fig. 19.2. First, all of them happen
to lead toward point $E$. This makes $E$ a stable (here, globally stable) intertemporal
equilibrium. Later, we shall encounter other types of streamline configurations. Second, while
some streamlines never venture beyond a single region (such as the one passing through
point $A$), others may cross over from one region into another (such as those passing through
$B$ and $C$). Third, where a streamline crosses over, it must have either an infinite slope
(crossing the $x' = 0$ curve) or a zero slope (crossing the $y' = 0$ curve). This is due to the
fact that, along the $x' = 0$ ($y' = 0$) curve, $x$ ($y$) is stationary over time, so the
streamline must not have any horizontal (vertical) movement while crossing that curve. To
ensure that these slope requirements are consistently met, it would be advisable, as soon as
the demarcation curves have been put in place, to add a few short vertical sketching bars
across the $x' = 0$ curve and a few horizontal ones across the $y' = 0$ curve, as guidelines
for the drawing of the streamlines.† Fourth, and last, although the streamlines do explicitly
point out the directions of movement of $x$ and $y$ over time, they provide no specific
information regarding velocity and acceleration, because the phase diagram does not allow for
an axis for $t$ (time). It is for this reason, of course, that streamlines carry the
alternative name of phase paths, as opposed to time paths. The only observation we can make
about velocity is qualitative in nature: As we move along a streamline closer and closer to
the $x' = 0$ ($y' = 0$) curve, the velocity of approach in the horizontal (vertical) direction
must progressively diminish. This is due to the steady decrease in the absolute value of the
derivative $x' \equiv dx/dt$ ($y' \equiv dy/dt$) that occurs as we move toward the demarcation
line on which $x'$ ($y'$) takes a zero value.
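A streamline can also be traced numerically. The sketch below is an illustration under stated assumptions: it takes a hypothetical linear system that satisfies (19.45), $x' = -2x + y + 2$ and $y' = x - 2y + 2$ (equilibrium at $(2, 2)$), and follows phase paths from several arbitrary starting points by simple Euler steps. The step size, the starting points and the function name `streamline` are choices made for this example.

```python
def f(x, y):
    return -2 * x + y + 2      # x' ; f_x < 0, f_y > 0


def g(x, y):
    return x - 2 * y + 2       # y' ; g_x > 0, g_y < 0


def streamline(x0, y0, dt=0.01, steps=2000):
    """Trace a phase path from (x0, y0) by Euler steps; return the list of points."""
    path = [(x0, y0)]
    x, y = x0, y0
    for _ in range(steps):
        # Both coordinates are updated from the same current point.
        x, y = x + dt * f(x, y), y + dt * g(x, y)
        path.append((x, y))
    return path


# One arbitrary starting point in each of the four regions.
starts = [(4.0, 4.5), (0.0, 4.0), (0.5, 0.0), (4.0, 0.0)]
for x0, y0 in starts:
    xe, ye = streamline(x0, y0)[-1]
    print(f"start ({x0}, {y0}) -> end ({xe:.6f}, {ye:.6f})")
```

Each path is a list of $(x, y)$ points with no time axis attached, which is the sense in which a streamline is a phase path rather than a time path. In this example every path ends near $(2, 2)$, as expected of a stable node.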


Types of Equilibrium 

Depending on the configurations of the streamlines surrounding a particular intertemporal
equilibrium, that equilibrium may fall into one of four categories: (1) nodes, (2) saddle
points, (3) foci or focuses, and (4) vortices or vortexes.

A node is an equilibrium such that all the streamlines associated with it either flow
noncyclically toward it (stable node) or flow noncyclically away from it (unstable node). We
have already encountered a stable node in Fig. 19.2. An unstable node is shown in
Fig. 19.3a. Note that in this particular illustration, it happens that the streamlines never
cross over from region to region. Also, the $x' = 0$ and $y' = 0$ curves happen to be linear,
and, in fact, they themselves serve as streamlines.

A saddle point is an equilibrium with a double personality: it is stable in some
directions, but unstable in others. More accurately, with reference to the illustration in
Fig. 19.3b, a saddle point has exactly one pair of streamlines, called the stable branches
of the saddle point, that flow directly and consistently toward the equilibrium, and exactly
one pair of streamlines, the unstable branches, that flow directly and consistently away
from it. All the other trajectories head toward the saddle point initially but sooner or later
turn away from it. This double personality, of course, is what inspired the name "saddle
point." Since stability is observed only on the stable branches, which are not reachable as a
matter of course, a saddle point is generically classified as an unstable equilibrium.

The third type of equilibrium, focus, is one characterized by whirling trajectories, all of
which either flow cyclically toward it (stable focus), or flow cyclically away from it
(unstable focus). Figure 19.3c illustrates a stable focus, with only one streamline explicitly
drawn in order to avoid clutter. What causes the whirling motion to occur? The answer lies in
the way the $x' = 0$ and $y' = 0$ curves are positioned. In Fig. 19.3c, the two demarcation
curves are sloped in such a way that they take turns in blockading the streamline flowing in a
direction prescribed by a particular set of $xy$ arrows. As a result, the streamline is
frequently compelled to cross over from one region into another, tracing out a spiral. Whether
we get a stable focus (as is the case here) or an unstable one depends on the relative
placement of the two demarcation curves. But in either case, the slope of the streamline at the
crossover points must still be either infinite (crossing $x' = 0$) or zero (crossing $y' = 0$).


† To aid your memory, note that the sketching bars across the $x' = 0$ curve should be
perpendicular to the $x$ axis. Similarly, the sketching bars across the $y' = 0$ curve should
be perpendicular to the $y$ axis.


FIGURE 19.3 [Four phase diagrams in the $xy$ plane: (a) unstable node, (b) saddle point, (c) stable focus, (d) vortex]

Finally, we may have a vortex (or center). This is again an equilibrium with whirling
streamlines, but these streamlines now form a family of loops (concentric circles or ovals)
orbiting around the equilibrium in a perpetual motion. An example of this is given in
Fig. 19.3d, where, again, only a single streamline is shown. Inasmuch as this type of
equilibrium is unattainable from any initial position away from point $E$, a vortex is
automatically classified as an unstable equilibrium.

All the illustrations in Fig. 19.3 display a unique equilibrium. When sufficient
nonlinearity exists, however, the two demarcation curves may intersect more than once, thereby
producing multiple equilibria. In that event, a combination of the previously cited types of
intertemporal equilibrium may exist in the same phase diagram. Although there will then
be more than four regions to contend with, the underlying principle of phase-diagram
analysis will remain basically the same.


Inflation and Monetary Rule à la Obst

As an economic illustration of the two-variable phase diagram, we shall present a model
due to Professor Obst,† which purports to show the ineffectiveness of the conventional
(hence the need for a new) type of countercyclical monetary-policy rule, when an "inflation
adjustment mechanism" is at work. Such a model contrasts with our earlier discussion of
inflation in that, instead of studying the implications of a given rate of monetary expansion,
it looks further into the efficacy of two different monetary rules, each prescribing a
different set of monetary actions to be pursued in the face of various inflationary conditions.

A crucial assumption of the model is the inflation adjustment mechanism

$$\frac{dp}{dt} = h\left(\frac{M_s - M_d}{M_s}\right) = h\left(1 - \frac{M_d}{M_s}\right) \qquad (h > 0) \tag{19.48}$$

which shows that the effect of an excess supply of money ($M_s > M_d$) is to raise the rate of
inflation $p$, rather than the price level $P$. The clearance of the money market would thus
imply not price stability, but only a stable rate of inflation. To facilitate the analysis, the
second equality in (19.48) serves to shift the focus from the excess supply of money to the
demand-supply ratio of money, $M_d/M_s$, which we shall denote by $\mu$. On the assumption
that $M_d$ is directly proportional to the nominal national product $PQ$, we can write

$$\mu \equiv \frac{M_d}{M_s} = \frac{aPQ}{M_s} \qquad (a > 0)$$


The rates of growth of the several variables are then related by

$$\begin{aligned}
\frac{d\mu/dt}{\mu} &= \frac{da/dt}{a} + \frac{dP/dt}{P} + \frac{dQ/dt}{Q} - \frac{dM_s/dt}{M_s} && [\text{by (10.24) and (10.25)}] \\
&= p + q - m && [a = \text{a constant}]
\end{aligned} \tag{19.49}$$

where the lowercase letters $p$, $q$, and $m$ denote, respectively, the rate of inflation, the
(exogenous) rate of growth of the real national product, and the rate of monetary expansion.

Equations (19.48) and (19.49), a set of two differential equations, can jointly determine
the time paths of $p$ and $\mu$, if, for the time being, $m$ is taken to be exogenous. Using
the symbols $p'$ and $\mu'$ to represent the time derivatives $p'(t)$ and $\mu'(t)$, we can
express this system more concisely as

$$\begin{aligned}
p' &= h(1 - \mu) \\
\mu' &= (p + q - m)\mu
\end{aligned} \tag{19.50}$$

† Norman P. Obst, "Stabilization Policy with an Inflation Adjustment Mechanism," Quarterly
Journal of Economics, May 1978, pp. 355-359. No phase diagrams are given in the Obst paper,
but they can be readily constructed from the model.


FIGURE 19.4 [Two phase diagrams, (a) and (b), each with $p$ on the horizontal axis and $\mu$ on the vertical axis; the horizontal axis is marked at $m - q$ in panel (a) and at $m(0) - q$ in panel (b)]





Given that $h$ is positive, we can have $p' = 0$ if and only if $1 - \mu = 0$. Similarly,
since $\mu$ is always positive, $\mu' = 0$ if and only if $p + q - m = 0$. Thus the $p' = 0$
and $\mu' = 0$ demarcation curves are associated with the equations

$$\mu = 1 \qquad [p' = 0 \text{ curve}] \tag{19.51}$$
$$p = m - q \qquad [\mu' = 0 \text{ curve}] \tag{19.52}$$

As shown in Fig. 19.4a, these plot as a horizontal line and a vertical line, respectively, and
yield a unique equilibrium at $E$. The equilibrium value $\mu = 1$ means that in equilibrium
$M_d$ and $M_s$ are equal, clearing the money market. The fact that the equilibrium rate of
inflation is shown to be positive reflects an implicit assumption that $m > q$.

Since the $p' = 0$ curve corresponds to the $x' = 0$ curve in our previous discussion, it
should have vertical sketching bars. And the other curve should have horizontal ones. From
(19.50), we find that

$$\frac{\partial p'}{\partial \mu} = -h < 0 \qquad \text{and} \qquad \frac{\partial \mu'}{\partial p} = \mu > 0 \tag{19.53}$$

with the implication that a northward movement across the $p' = 0$ curve passes through the
$(+, 0, -)$ sequence of signs for $p'$, and an eastward movement across the $\mu' = 0$ curve,
the $(-, 0, +)$ sequence of signs for $\mu'$. Thus we obtain the four sets of directional
arrows as drawn, which generate streamlines (only one of which is shown) that orbit
counterclockwise around point $E$. This, of course, makes $E$ a vortex. Unless the economy
happens initially to be at $E$, it is impossible to attain equilibrium. Instead, there will be
never-ceasing fluctuation.
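The vortex can be illustrated by integrating (19.50) numerically with $m$ exogenous. The sketch below is an illustration, not part of Obst's model: the parameter values ($h = 0.5$, $q = 0.03$, $m = 0.08$), the starting point and the function names are all assumptions. It also uses an auxiliary quantity introduced here, $H = \tfrac{1}{2}[p - (m - q)]^2 + h(\mu - \ln \mu)$, whose time derivative along (19.50) is zero; a constant $H$ means the path stays on a closed loop around $E$ rather than spiraling in or out.

```python
import math

H_PARAM, Q, M = 0.5, 0.03, 0.08   # hypothetical h, q and m, with m > q


def rhs(p, mu):
    """Right-hand sides of (19.50) with m exogenous."""
    return H_PARAM * (1 - mu), (p + Q - M) * mu


def rk4_step(p, mu, dt):
    """One classical Runge-Kutta step for the pair (p, mu)."""
    k1p, k1m = rhs(p, mu)
    k2p, k2m = rhs(p + dt / 2 * k1p, mu + dt / 2 * k1m)
    k3p, k3m = rhs(p + dt / 2 * k2p, mu + dt / 2 * k2m)
    k4p, k4m = rhs(p + dt * k3p, mu + dt * k3m)
    return (p + dt / 6 * (k1p + 2 * k2p + 2 * k3p + k4p),
            mu + dt / 6 * (k1m + 2 * k2m + 2 * k3m + k4m))


def invariant(p, mu):
    """Auxiliary quantity that stays constant along solutions of (19.50)."""
    return 0.5 * (p - (M - Q)) ** 2 + H_PARAM * (mu - math.log(mu))


def orbit(p0, mu0, dt=0.01, steps=5000):
    pts = [(p0, mu0)]
    for _ in range(steps):
        pts.append(rk4_step(*pts[-1], dt))
    return pts


path = orbit(0.10, 1.0)            # start to the east of E = (0.05, 1)
drift = max(abs(invariant(p, mu) - invariant(*path[0])) for p, mu in path)
print("largest change in the invariant:", drift)
```

Starting due east of $E$, the first movement is northward ($\mu$ rises while $p$ is momentarily stationary), in line with the counterclockwise orientation of the arrows.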

The preceding conclusion is, however, the consequence of an exogenous rate of monetary
expansion. What if we now endogenize $m$ by adopting an anti-inflationary monetary
rule? The "conventional" monetary rule would call for gearing the rate of monetary expansion
negatively to the rate of inflation:

$$m = m(p) \qquad m'(p) < 0 \qquad [\text{conventional monetary rule}] \tag{19.54}$$

Such a rule would modify the second equation in (19.50) to

$$\mu' = [p + q - m(p)]\,\mu \tag{19.55}$$

and alter (19.52) to

$$p = m(p) - q \qquad [\mu' = 0 \text{ curve under conventional monetary rule}] \tag{19.56}$$

Given that $m(p)$ is monotonic, there exists only one value of $p$ (say, $p_1$) that can
satisfy this equation. Hence the new $\mu' = 0$ curve must still emerge as a vertical straight
line, although with a different horizontal intercept $p_1 = m(p_1) - q$. Moreover, from
(19.55), we find that

$$\frac{\partial \mu'}{\partial p} = [1 - m'(p)]\,\mu > 0 \qquad [\text{by (19.54)}]$$

which is qualitatively no different from the derivative in (19.53). It follows that the
directional arrows must also remain as they are in Fig. 19.4a. In short, we would end up with a
vortex as before.

The alternative monetary rule proposed by Obst is to gear $m$ to the rate of change (rather
than the level) of the rate of inflation:

$$m = m(p') \qquad m'(p') < 0 \qquad [\text{alternative monetary rule}] \tag{19.57}$$

Under this rule, (19.55) and (19.56) will become, respectively,

$$\mu' = [p + q - m(p')]\,\mu \tag{19.58}$$
$$p = m(p') - q \qquad [\mu' = 0 \text{ curve under alternative monetary rule}] \tag{19.59}$$

This time the $\mu' = 0$ curve would become upward-sloping. For, differentiating (19.59) with
respect to $\mu$ via the chain rule, we have

$$\frac{dp}{d\mu} = m'(p')\,\frac{dp'}{d\mu} = m'(p')(-h) > 0 \qquad [\text{by (19.50)}]$$

so, by the inverse-function rule, $d\mu/dp$ (the slope of the $\mu' = 0$ curve) is also
positive. This new situation is illustrated in Fig. 19.4b, where, for simplicity, the
$\mu' = 0$ curve is drawn as a straight line, with an arbitrarily assigned slope.* Despite the
slope change, the partial derivative

$$\frac{\partial \mu'}{\partial p} = \mu > 0 \qquad [\text{from (19.58)}]$$

is unchanged from (19.53), so the $\mu'$ arrows should retain their original orientation in
Fig. 19.4a. The streamlines (only one of which is shown) will now twirl inwardly toward
the equilibrium at $\mu = 1$ and $p = m(0) - q$, where $m(0)$ denotes $m(p')$ evaluated at
$p' = 0$. Thus the alternative monetary rule is seen to be capable of converting a vortex into
a stable focus, thereby making possible the asymptotic elimination of the perpetual
fluctuation in the rate of inflation. Indeed, with a sufficiently flat $\mu' = 0$ curve, it is
even possible to turn the vortex into a stable node.
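The effect of the alternative rule can be sketched numerically under an assumed functional form. The code below takes a hypothetical linear rule $m(p') = m_0 - c\,p'$ with $c > 0$ (so that $m'(p') = -c < 0$), integrates (19.50) with (19.58), and reports where the path ends. The parameter values, the linear rule and the function names are assumptions. The helper `local_type` relies on the linear approximation at $E$, whose Jacobian has trace $-ch$ and determinant $h$, so the characteristic roots are complex when $(ch)^2 < 4h$ and real otherwise.

```python
def rk4(rhs, state, dt, steps):
    """Integrate a two-variable autonomous system with classical Runge-Kutta."""
    x, y = state
    for _ in range(steps):
        k1 = rhs(x, y)
        k2 = rhs(x + dt / 2 * k1[0], y + dt / 2 * k1[1])
        k3 = rhs(x + dt / 2 * k2[0], y + dt / 2 * k2[1])
        k4 = rhs(x + dt * k3[0], y + dt * k3[1])
        x += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return x, y


H, Q, M0 = 0.5, 0.03, 0.08     # hypothetical h, q and m(0)


def obst_alternative(c):
    """System (19.50)/(19.58) under the assumed rule m(p') = M0 - c*p', c > 0."""
    def rhs(p, mu):
        dp = H * (1 - mu)              # first equation of (19.50)
        m = M0 - c * dp                # assumed linear monetary rule
        return dp, (p + Q - m) * mu    # (19.58)
    return rhs


def local_type(c):
    """Classify E from the linearization at E: trace = -c*H, determinant = H."""
    return "stable focus" if (c * H) ** 2 < 4 * H else "stable node"


for c in (1.0, 4.0):
    p_end, mu_end = rk4(obst_alternative(c), (0.10, 1.0), dt=0.01, steps=10000)
    print(c, local_type(c), p_end, mu_end)
```

A larger $c$ corresponds to a flatter $\mu' = 0$ curve, and in this sketch it is the larger value of $c$ that yields the node rather than the focus.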


* The slope is inversely proportional to the absolute value of $m'(p')$. The more sensitively
the rate of monetary expansion $m$ is made to respond to the rate of change of the rate of
inflation, $p'$, the flatter the $\mu' = 0$ curve will be in Fig. 19.4b.







EXERCISE 19.5 


1. Show that the two-variable phase diagram can also be used, if the model consists of a
single second-order differential equation, $y''(t) = f(y', y)$, instead of two first-order
equations.

2. The plus and minus signs appended to the two sides of the $x' = 0$ and $y' = 0$ curves in
Fig. 19.1 are based on the partial derivatives $\partial x'/\partial x$ and
$\partial y'/\partial y$, respectively. Can the same conclusions be obtained from the
derivatives $\partial x'/\partial y$ and $\partial y'/\partial x$?

3. Using Fig. 19.2, verify that if a streamline does not have an infinite (zero) slope when
crossing the $x' = 0$ ($y' = 0$) curve, it will necessarily violate the directional
restrictions imposed by the $xy$ arrows.

4. As special cases of the differential-equation system (19.40), assume that
(a) $f_x = 0$, $f_y > 0$, $g_x > 0$, and $g_y = 0$
(b) $f_x = 0$, $f_y < 0$, $g_x < 0$, and $g_y = 0$
For each case, construct an appropriate phase diagram, draw the streamlines, and
determine the nature of the equilibrium.

5. (a) Show that it is possible to produce either a stable node or a stable focus from the
differential-equation system (19.40), if
$f_x < 0$, $f_y > 0$, $g_x < 0$, and $g_y < 0$

(b) What special feature(s) in your phase-diagram construction are responsible for the
difference in the outcomes (node versus focus)?

6. With reference to the Obst model, verify that if the positively sloped $\mu' = 0$ curve in
Fig. 19.4b is made sufficiently flat, the streamlines, although still characterized by
crossovers, will converge to the equilibrium in the manner of a node rather than a
focus.


19.6 Linearization of a Nonlinear
Differential-Equation System





Another qualitative technique of analyzing a nonlinear differential-equation system is to
draw inferences from the linear approximation to that system, to be derived from the Taylor
expansion of the given system around its equilibrium.† We learned in Sec. 9.5 that a linear
(or even a higher-order polynomial) approximation to an arbitrary function $\phi(x)$ can give
us the exact value of $\phi(x)$ at the point of expansion, but will entail progressively larger
errors of approximation as we move farther away from the point of expansion. The same is
true of the linear approximation to a nonlinear system. At the point of expansion (here, the
equilibrium point $E$) the linear approximation can pinpoint exactly the same equilibrium
as the original nonlinear system. And in a sufficiently small neighborhood of $E$, the linear
approximation should have the same general streamline configuration as the original system.
As long as we are willing to confine our stability inferences to the immediate neighborhood
of the equilibrium, therefore, the linear approximation could serve as an adequate
source of information. Such analysis, referred to as local stability analysis, can be used
either by itself, or as a supplement to the phase-diagram analysis. We shall deal with the
two-variable case only.


† In the case of multiple equilibria, each equilibrium requires a separate linear approximation.


Taylor Expansion and Linearization 
Given an arbitrary (successively differentiable) one-variable function $\phi(x)$, the Taylor
expansion around a point $x_0$ gives the series

$$\phi(x) = \phi(x_0) + \phi'(x_0)(x - x_0) + \frac{\phi''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{\phi^{(n)}(x_0)}{n!}(x - x_0)^n + R_n$$

where a polynomial involving various powers of $(x - x_0)$ appears on the right. A similar
structure characterizes the Taylor expansion of a function of two variables $f(x, y)$ around
any point $(x_0, y_0)$. With two variables in the picture, however, the resulting polynomial
would comprise various powers of $(y - y_0)$ as well as $(x - x_0)$, and, in fact, also the
products of these two expressions:

$$\begin{aligned}
f(x, y) = {} & f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) \\
& + \frac{1}{2!}\left[f_{xx}(x_0, y_0)(x - x_0)^2 + 2 f_{xy}(x_0, y_0)(x - x_0)(y - y_0) + f_{yy}(x_0, y_0)(y - y_0)^2\right] \\
& + \cdots + R_n
\end{aligned} \tag{19.60}$$

Note that the coefficients of the $(x - x_0)$ and $(y - y_0)$ expressions are now the partial
derivatives of $f$, all evaluated at the expansion point $(x_0, y_0)$.

From the Taylor series of a function, the linear approximation (or linearization for
short) is obtained by simply dropping all terms of order higher than one. Thus, for the
one-variable case, the linearization is the following linear function of $x$:

$$\phi(x_0) + \phi'(x_0)(x - x_0)$$

Similarly, the linearization of (19.60) is the following linear function of $x$ and $y$:

$$f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0)$$

Besides, by substituting the function symbol $g$ for $f$ in this result, we can also get the
corresponding linearization of $g(x, y)$. It follows that, given the nonlinear system

$$\begin{aligned}
x' &= f(x, y) \\
y' &= g(x, y)
\end{aligned} \tag{19.61}$$

its linearization around the expansion point $(x_0, y_0)$ can be written as

$$\begin{aligned}
x' &= f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0) \\
y' &= g(x_0, y_0) + g_x(x_0, y_0)(x - x_0) + g_y(x_0, y_0)(y - y_0)
\end{aligned} \tag{19.62}$$

If the specific forms of the functions $f$ and $g$ are known, then $f(x_0, y_0)$,
$f_x(x_0, y_0)$, $f_y(x_0, y_0)$ and their counterparts for the $g$ function can all be
assigned specific values and the linear system (19.62) solved quantitatively. However, even if
the $f$ and $g$ functions are given in general forms, qualitative analysis is still possible,
provided only that the signs of $f_x$, $f_y$, $g_x$, and $g_y$ are ascertainable.
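To make the linearization (19.62) concrete, the sketch below builds it numerically for a hypothetical nonlinear system, $x' = y - x^2$ and $y' = x - y$, around its equilibrium $(1, 1)$. The system, the central-difference estimate of the partial derivatives and the function names are assumptions made for this example. The last lines compare the linearized $x'$ with the true $x'$ at two distances from the expansion point.

```python
def F(x, y):
    """Hypothetical nonlinear system: x' = y - x**2, y' = x - y."""
    return (y - x ** 2, x - y)


def jacobian(F, x0, y0, eps=1e-6):
    """Central-difference estimate of [[f_x, f_y], [g_x, g_y]] at (x0, y0)."""
    up_x, dn_x = F(x0 + eps, y0), F(x0 - eps, y0)
    up_y, dn_y = F(x0, y0 + eps), F(x0, y0 - eps)
    fx, gx = [(u - d) / (2 * eps) for u, d in zip(up_x, dn_x)]
    fy, gy = [(u - d) / (2 * eps) for u, d in zip(up_y, dn_y)]
    return [[fx, fy], [gx, gy]]


def linearize(F, x0, y0):
    """Return the linear approximation (19.62) of F around (x0, y0)."""
    f0, g0 = F(x0, y0)
    (a, b), (c, d) = jacobian(F, x0, y0)

    def L(x, y):
        return (f0 + a * (x - x0) + b * (y - y0),
                g0 + c * (x - x0) + d * (y - y0))
    return L


E = (1.0, 1.0)                      # an equilibrium: F(1, 1) = (0, 0)
J = jacobian(F, *E)
L = linearize(F, *E)


def error(dx):
    """Absolute error of the linearized x' at a horizontal distance dx from E."""
    return abs(F(E[0] + dx, E[1])[0] - L(E[0] + dx, E[1])[0])


print(J, error(0.1), error(0.01))
```

In this example the approximation error shrinks much faster than the distance from $E$, which is the sense in which the linearization is reliable only in a small neighborhood of the expansion point.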




The Reduced Linearization 

For purposes of local stability analysis, the linearization (19.62) can be put into a simpler 
form. First, since our point of expansion is lo be the equilibrium point (¥, 7), we should re- 
place (xo, yo) by (¥. 7). More substantively, since at the equilibrium point we have 
x! = y’ = 0 by definition, it follows that 


(EP =eEF)=0 — by (19.61)] 


so the first term on the right side of each equation in (19.62) can be dropped. Making these 
changes, then multiplying out the remaining terms on the right of (19.62) and rearranging, 
we obtain another version of the linearization: 


$$ \begin{aligned} x' - f_x(\bar{x}, \bar{y})\,x - f_y(\bar{x}, \bar{y})\,y &= -f_x(\bar{x}, \bar{y})\,\bar{x} - f_y(\bar{x}, \bar{y})\,\bar{y} \\ y' - g_x(\bar{x}, \bar{y})\,x - g_y(\bar{x}, \bar{y})\,y &= -g_x(\bar{x}, \bar{y})\,\bar{x} - g_y(\bar{x}, \bar{y})\,\bar{y} \end{aligned} \tag{19.63} $$

Note that, in (19.63), each term on the right of the equals signs represents a constant. We
took the trouble to separate out these constant terms so that we can now drop them all, to
get to the reduced equations of the linearization. The result, which may be written in matrix
notation as

$$ \begin{bmatrix} x' \\ y' \end{bmatrix} - \begin{bmatrix} f_x & f_y \\ g_x & g_y \end{bmatrix}_{(\bar{x}, \bar{y})} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \tag{19.64} $$

constitutes the reduced linearization of (19.61). Inasmuch as qualitative analysis depends
exclusively on the knowledge of the characteristic roots, which, in turn, hinge only on
the reduced equations of a system, (19.64) is all we need for the desired local stability
analysis.

Going a step further, it may be observed that the only distinguishing property of the
reduced linearization lies in the matrix of partial derivatives, the Jacobian matrix of the
nonlinear system (19.61), evaluated at the equilibrium $(\bar{x}, \bar{y})$. Hence, in the final analysis,
the local stability or instability of the equilibrium is predicated solely on the makeup of the
said Jacobian. For notational convenience in the ensuing discussion, we shall denote the
Jacobian evaluated at the equilibrium by $J_E$, and its elements by $a$, $b$, $c$, and $d$:

$$ J_E = \begin{bmatrix} f_x & f_y \\ g_x & g_y \end{bmatrix}_{(\bar{x}, \bar{y})} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \tag{19.65} $$


It will be assumed that the two differential equations are functionally independent. Then we
shall always have $|J_E| \neq 0$. (For some cases where $|J_E| = 0$, see Exercise 19.6-4.)


Local Stability Analysis 
According to (19.16), and using (19.65), the characteristic equation of the reduced
linearization should be

$$ \begin{vmatrix} r - a & -b \\ -c & r - d \end{vmatrix} = r^2 - (a + d)r + (ad - bc) = 0 $$

It is clear that the characteristic roots depend critically on the expressions $(a + d)$ and
$(ad - bc)$. The latter is merely the determinant of the Jacobian in (19.65):

$$ ad - bc = |J_E| $$




And the former, representing the sum of the principal-diagonal elements of that Jacobian,
is called the trace of $J_E$, symbolized by $\mathrm{tr}\,J_E$:

$$ a + d = \mathrm{tr}\,J_E $$

Accordingly, the characteristic roots can be expressed as

$$ r_1, r_2 = \frac{\mathrm{tr}\,J_E \pm \sqrt{(\mathrm{tr}\,J_E)^2 - 4|J_E|}}{2} $$

The relative magnitudes of $(\mathrm{tr}\,J_E)^2$ and $4|J_E|$ will determine whether the two roots are real
or complex, that is, whether the time paths of $x$ and $y$ are steady or fluctuating. To check the
dynamic stability of equilibrium, on the other hand, we need to ascertain the algebraic signs
of the two roots. For that purpose, the following two relationships will prove to be most
helpful:

$$ r_1 + r_2 = \mathrm{tr}\,J_E \tag{19.66} $$

$$ r_1 r_2 = |J_E| \qquad \text{[cf. (16.5) and (16.6)]} \tag{19.67} $$
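
As a quick numerical check of the root formula and of (19.66) and (19.67), the sketch below computes both roots from a trace and a determinant. The function name `characteristic_roots` and the Jacobian elements are hypothetical, chosen only for illustration; `cmath.sqrt` is used so that the same code also covers the complex-root case.

```python
import cmath


def characteristic_roots(trace, det):
    """Roots of r^2 - trace*r + det = 0; complex when trace^2 < 4*det."""
    disc = cmath.sqrt(trace ** 2 - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2


# Hypothetical Jacobian elements a, b, c, d (not taken from the text).
a, b, c, d = 1.0, 2.0, 3.0, 0.0
trace = a + d          # tr J_E
det = a * d - b * c    # |J_E|
r1, r2 = characteristic_roots(trace, det)

root_sum = r1 + r2       # should reproduce the trace, as in (19.66)
root_product = r1 * r2   # should reproduce the determinant, as in (19.67)
```

With these particular numbers the determinant is negative, so the two roots are real and of opposite signs.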
Case 1 $(\mathrm{tr}\,J_E)^2 > 4|J_E|$ In this case, the roots are real and distinct, and no fluctuation is
possible. Hence the equilibrium can be either a node or a saddle point, but never a focus or
vortex. In view that $r_1 \neq r_2$, there exist three distinct possibilities of sign combination: both
roots negative, both roots positive, and two roots with opposite signs.† Taking into account
the information in (19.66) and (19.67), these three possibilities are characterized by:

$$ \begin{aligned} &\text{(i)} & r_1 < 0,\ r_2 < 0 \quad &\Rightarrow \quad |J_E| > 0;\ \mathrm{tr}\,J_E < 0 \\ &\text{(ii)} & r_1 > 0,\ r_2 > 0 \quad &\Rightarrow \quad |J_E| > 0;\ \mathrm{tr}\,J_E > 0 \\ &\text{(iii)} & r_1 > 0,\ r_2 < 0 \quad &\Rightarrow \quad |J_E| < 0;\ \mathrm{tr}\,J_E \gtreqless 0 \end{aligned} $$

Under Possibility i, with both roots negative, both complementary functions $x_c$ and $y_c$ tend
to zero as $t$ becomes infinite. The equilibrium is thus a stable node. The opposite is true
under Possibility ii, which describes an unstable node. In contrast, with two roots of
opposite signs, Possibility iii yields a saddle point.


To see this last case more clearly, recall that the complementary functions of the two
variables under Case 1 take the general form

$$ \begin{aligned} x_c &= A_1 e^{r_1 t} + A_2 e^{r_2 t} \\ y_c &= k_1 A_1 e^{r_1 t} + k_2 A_2 e^{r_2 t} \end{aligned} $$

where the arbitrary constants $A_1$ and $A_2$ are to be determined from the initial conditions. If
the initial conditions are such that $A_1 = 0$, the positive root $r_1$ will drop out of the picture,
leaving it to the negative root $r_2$ to make the equilibrium stable. Such initial conditions
pertain to the points located on the stable branches of the saddle point. On the other hand, if
the initial conditions are such that $A_2 = 0$, the negative root $r_2$ will vanish from the scene,
leaving it to the positive root $r_1$ to make the equilibrium unstable. Such initial conditions
relate to the points lying on the unstable branches. Inasmuch as all the other initial
conditions also involve $A_1 \neq 0$, they must all give rise to divergent complementary functions,
too. Thus Possibility iii yields a saddle point.





† Since we have ruled out $|J_E| = 0$, no root can take a zero value.


TABLE 19.1 Local Stability Analysis of a Two-Variable Nonlinear Differential-Equation System




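Case 2 $(\mathrm{tr}\,J_E)^2 = 4|J_E|$ As the roots are repeated in this case, only two possibilities of sign
combination can arise:

$$ \begin{aligned} &\text{(iv)} & r_1 < 0,\ r_2 < 0 \quad &\Rightarrow \quad |J_E| > 0;\ \mathrm{tr}\,J_E < 0 \\ &\text{(v)} & r_1 > 0,\ r_2 > 0 \quad &\Rightarrow \quad |J_E| > 0;\ \mathrm{tr}\,J_E > 0 \end{aligned} $$

These two possibilities are mere duplicates of Possibilities i and ii. Thus they point to a
stable node and an unstable node, respectively.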


Case 3 $(\mathrm{tr}\,J_E)^2 < 4|J_E|$ This time, with complex roots $h \pm vi$, cyclical fluctuation is
present, and we must encounter either a focus or a vortex. On the basis of (19.66) and (19.67),
we have in the present case

$$ \begin{aligned} \mathrm{tr}\,J_E &= r_1 + r_2 = (h + vi) + (h - vi) = 2h \\ |J_E| &= r_1 r_2 = (h + vi)(h - vi) = h^2 + v^2 \end{aligned} $$

Thus $\mathrm{tr}\,J_E$ has to take the same sign as $h$, whereas $|J_E|$ is invariably positive. Consequently,
there are three possible outcomes:

$$ \begin{aligned} &\text{(vi)} & h < 0 \quad &\Rightarrow \quad |J_E| > 0;\ \mathrm{tr}\,J_E < 0 \\ &\text{(vii)} & h > 0 \quad &\Rightarrow \quad |J_E| > 0;\ \mathrm{tr}\,J_E > 0 \\ &\text{(viii)} & h = 0 \quad &\Rightarrow \quad |J_E| > 0;\ \mathrm{tr}\,J_E = 0 \end{aligned} $$

These are associated, respectively, with damped fluctuation, explosive fluctuation, and
uniform fluctuation. In other words, Possibility vi implies a stable focus; Possibility vii, an
unstable focus; and Possibility viii, a vortex.


The conclusions from the preceding discussion are summarized in Table 19.1 to facilitate
qualitative inferences from the signs of $|J_E|$ and $\mathrm{tr}\,J_E$. Three features of the table are
especially noteworthy. First, a negative $|J_E|$ is exclusively tied to the saddle-point type of
equilibrium. This suggests that $|J_E| < 0$ is a necessary-and-sufficient condition for a
saddle point. Second, a zero value for $\mathrm{tr}\,J_E$ occurs only under two circumstances: when
there is a saddle point or a vortex. These two circumstances are, however, distinguishable
from each other by the sign of $|J_E|$. Accordingly, a zero $\mathrm{tr}\,J_E$ coupled with a positive $|J_E|$
is necessary-and-sufficient for a vortex. Third, while a negative sign for $\mathrm{tr}\,J_E$ is necessary
for dynamic stability, it is not sufficient, on account of the possibility of a saddle point.








Case                         Sign of |J_E|    Sign of tr J_E    Type of equilibrium
1. (tr J_E)^2 > 4|J_E|       +                -                 Stable node
                             +                +                 Unstable node
                             -                +, 0, -           Saddle point
2. (tr J_E)^2 = 4|J_E|       +                -                 Stable node
                             +                +                 Unstable node
3. (tr J_E)^2 < 4|J_E|       +                -                 Stable focus
                             +                +                 Unstable focus
                             +                0                 Vortex










Nevertheless, when a negative $\mathrm{tr}\,J_E$ is accompanied by a positive $|J_E|$, we do have a
necessary-and-sufficient condition for dynamic stability.

The discussion leading to the summary in Table 19.1 has been conducted in the context
of a linear approximation to a nonlinear system. However, the contents of that table are
obviously applicable also to the qualitative analysis of a system that is linear to begin with. In
the latter case, the elements of the Jacobian matrix will be a set of given constants, so there
is no need to evaluate them at the equilibrium. Since there is no approximation process
involved, the stability inferences will no longer be "local" in nature but will have global
validity.
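
The sign rules of Table 19.1 can be encoded as a small lookup. This is only a sketch of one way to do so; the function name `classify_equilibrium` is hypothetical, and the exact comparisons with zero and with $4|J_E|$ assume inputs that are not contaminated by rounding error.

```python
def classify_equilibrium(det, trace):
    """Type of equilibrium from the signs of |J_E| (det) and tr J_E (trace).

    Follows Table 19.1 and, like the text, assumes det != 0.
    """
    if det == 0:
        raise ValueError("Table 19.1 assumes a nonzero Jacobian determinant")
    if det < 0:
        # A negative determinant is necessary and sufficient for a saddle point.
        return "saddle point"
    # From here on det > 0.
    if trace == 0:
        # Zero trace with a positive determinant: uniform fluctuation.
        return "vortex"
    kind = "node" if trace ** 2 >= 4 * det else "focus"   # Cases 1-2 versus Case 3
    stability = "stable" if trace < 0 else "unstable"
    return f"{stability} {kind}"
```

For instance, `classify_equilibrium(2, -3)` falls under Case 1 with a negative trace, and `classify_equilibrium(1, 0)` is the vortex row of the table.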








Example 1

Analyze the local stability of the nonlinear system

$$ \begin{aligned} x' &= f(x, y) = xy - 2 \\ y' &= g(x, y) = 2x - y \end{aligned} \qquad (x, y \geq 0) $$

First, setting $x' = y' = 0$, and noting the nonnegativity of $x$ and $y$, we find a single
equilibrium $E$ at $(\bar{x}, \bar{y}) = (1, 2)$. Then, by taking the partial derivatives of $x'$ and $y'$, and
evaluating them at $E$, we obtain

$$ J_E = \begin{bmatrix} f_x & f_y \\ g_x & g_y \end{bmatrix}_{(1,2)} = \begin{bmatrix} y & x \\ 2 & -1 \end{bmatrix}_{(1,2)} = \begin{bmatrix} 2 & 1 \\ 2 & -1 \end{bmatrix} $$

Since $|J_E| = -4$ is negative, we can immediately conclude that the equilibrium is locally a
saddle point.

Note that while the first row of the Jacobian matrix originally contains the variables $y$ and
$x$, the second row does not. The reason for the difference is that the second equation in the
given system is originally linear, and requires no linearization.





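Example 2

Given the nonlinear system

$$ \begin{aligned} x' &= x^2 - y \\ y' &= 1 - y \end{aligned} $$

we can, by setting $x' = y' = 0$, find two equilibrium points: $E_1 = (1, 1)$ and $E_2 = (-1, 1)$.
Thus we need two separate linearizations. Evaluating the Jacobian
$\begin{bmatrix} 2x & -1 \\ 0 & -1 \end{bmatrix}$ at the two equilibriums in turn, we obtain

$$ J_{E_1} = \begin{bmatrix} 2 & -1 \\ 0 & -1 \end{bmatrix} \qquad \text{and} \qquad J_{E_2} = \begin{bmatrix} -2 & -1 \\ 0 & -1 \end{bmatrix} $$

The first of these has a negative determinant; thus $E_1 = (1, 1)$ is locally a saddle point. From
the second, we find that $|J_{E_2}| = 2$ and $\mathrm{tr}\,J_{E_2} = -3$. Hence, by Table 19.1, $E_2 = (-1, 1)$ is
locally a stable node under Case 1.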
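
The two linearizations in this example can be reproduced with a few lines of arithmetic. The sketch below is illustrative only; the helper names `jacobian` and `det_and_trace` are hypothetical.

```python
def jacobian(x, y):
    """Jacobian of x' = x**2 - y, y' = 1 - y, as rows [[f_x, f_y], [g_x, g_y]]."""
    return [[2 * x, -1.0], [0.0, -1.0]]


def det_and_trace(m):
    """Determinant and trace of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    return a * d - b * c, a + d


equilibria = {"E1": (1.0, 1.0), "E2": (-1.0, 1.0)}
results = {name: det_and_trace(jacobian(x, y)) for name, (x, y) in equilibria.items()}

det1, tr1 = results["E1"]   # negative determinant: saddle point
det2, tr2 = results["E2"]   # positive determinant, negative trace
case_1_at_E2 = tr2 ** 2 > 4 * det2   # real and distinct roots, hence a node
```

The determinant at `E1` is negative, while at `E2` the squared trace exceeds four times the determinant, which is what places that equilibrium under Case 1.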


Example 3

Does the linear system

$$ \begin{aligned} x' &= x - y + 2 \\ y' &= x + y + 4 \end{aligned} $$

possess a stable equilibrium? To answer such a qualitative question, we can simply
concentrate on the reduced equations and ignore the constants 2 and 4 altogether. As may be
expected from a linear system, the Jacobian $\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}$ has as its elements four constants.
Inasmuch as its determinant and trace are both equal to 2, the equilibrium falls under
Case 3 and is an unstable focus. Note that this conclusion is reached without having to solve
for the equilibrium. Note, also, that the conclusion is in this case globally valid.
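
Although the qualitative conclusion does not require it, the equilibrium and a rough time path are easy to compute, which gives an independent check. The sketch below assumes a simple Euler scheme with a hypothetical step size and starting point; it is not part of the original example.

```python
import cmath
import math

# Jacobian elements of the linear system (constants, as noted in the text).
a, b, c, d = 1.0, -1.0, 1.0, 1.0
trace, det = a + d, a * d - b * c
disc = cmath.sqrt(trace ** 2 - 4 * det)
r1, r2 = (trace + disc) / 2, (trace - disc) / 2   # complex pair with positive real part

# Equilibrium from x - y + 2 = 0 and x + y + 4 = 0, by Cramer's rule.
x_bar = (-2.0 * d - b * (-4.0)) / det
y_bar = (a * (-4.0) - c * (-2.0)) / det


def simulate(x, y, dt=0.001, steps=3000):
    """Euler approximation of the time path over steps*dt units of time."""
    for _ in range(steps):
        dx = x - y + 2
        dy = x + y + 4
        x, y = x + dt * dx, y + dt * dy
    return x, y


start_gap = 0.01                                  # hypothetical initial displacement
x_end, y_end = simulate(x_bar + start_gap, y_bar)
end_gap = math.hypot(x_end - x_bar, y_end - y_bar)
```

The distance from the equilibrium grows along the simulated path, as expected from roots whose real part is positive.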


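Example 4

Analyze the local stability of the Obst model (19.50),

$$ \begin{aligned} p' &= h(1 - \mu) \\ \mu' &= (p + q - m)\mu \end{aligned} $$

assuming that the rate of monetary expansion $m$ is exogenous (no monetary rule is
followed). According to Fig. 19.4a, the equilibrium of this model occurs at
$E = (\bar{p}, \bar{\mu}) = (m - q, 1)$. The Jacobian matrix evaluated at $E$ is

$$ J_E = \begin{bmatrix} \dfrac{\partial p'}{\partial p} & \dfrac{\partial p'}{\partial \mu} \\ \dfrac{\partial \mu'}{\partial p} & \dfrac{\partial \mu'}{\partial \mu} \end{bmatrix}_E = \begin{bmatrix} 0 & -h \\ \mu & p + q - m \end{bmatrix}_E = \begin{bmatrix} 0 & -h \\ 1 & 0 \end{bmatrix} $$

Since $|J_E| = h > 0$, and $\mathrm{tr}\,J_E = 0$, Table 19.1 indicates that the equilibrium is locally a
vortex. This conclusion is consistent with that of the phase-diagram analysis in Sec. 19.5.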


Example 5

Analyze the local stability of the Obst model, assuming that the alternative monetary rule is
as follows:

$$ \begin{aligned} p' &= h(1 - \mu) & &\text{[from (19.50)]} \\ \mu' &= [p + q - m(p')]\mu & &\text{[from (19.58)]} \end{aligned} $$

Note that since $p'$ is a function of $\mu$, the function $m(p')$ is in the present model also a
function of $\mu$. Thus we have to apply the product rule in finding $\partial\mu'/\partial\mu$. At the equilibrium
$E$, where $p' = \mu' = 0$, we have $\bar{\mu} = 1$ and $\bar{p} = m(0) - q$. The Jacobian evaluated at $E$ is,
therefore,

$$ J_E = \begin{bmatrix} 0 & -h \\ \mu & p + q - m(p') + m'(p')h\mu \end{bmatrix}_E = \begin{bmatrix} 0 & -h \\ 1 & m'(0)h \end{bmatrix} $$

where $m'(0)$ is negative by (19.57). According to Table 19.1, with $|J_E| = h > 0$ and
$\mathrm{tr}\,J_E = m'(0)h < 0$, we can have either a stable focus or a stable node, depending on the
relative magnitudes of $(\mathrm{tr}\,J_E)^2$ and $4|J_E|$. To be specific, the larger the absolute value of the
derivative $m'(0)$, the larger the absolute value of $\mathrm{tr}\,J_E$ will be and the more likely $(\mathrm{tr}\,J_E)^2$ will
exceed $4|J_E|$, to produce a stable node instead of a stable focus. This conclusion is again
consistent with what we learned from the phase-diagram analysis.
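
The dependence on $m'(0)$ can be made concrete. With $|J_E| = h$ and $\mathrm{tr}\,J_E = m'(0)h$, the condition $(\mathrm{tr}\,J_E)^2 \geq 4|J_E|$ reduces to $|m'(0)| \geq 2/\sqrt{h}$. The sketch below encodes that comparison; the function name and the numerical values of $h$ and $m'(0)$ are hypothetical, chosen only for illustration.

```python
def obst_equilibrium_type(h, m_prime0):
    """Stable node or stable focus for the Jacobian [[0, -h], [1, m'(0) h]]."""
    if h <= 0 or m_prime0 >= 0:
        raise ValueError("the example assumes h > 0 and m'(0) < 0")
    det = h                  # |J_E| = 0 * m'(0) h - (-h)(1) = h
    trace = m_prime0 * h     # tr J_E = m'(0) h
    return "stable node" if trace ** 2 >= 4 * det else "stable focus"


weak_rule = obst_equilibrium_type(h=1.0, m_prime0=-1.0)     # |m'(0)| below 2/sqrt(h)
strong_rule = obst_equilibrium_type(h=1.0, m_prime0=-3.0)   # |m'(0)| above 2/sqrt(h)
```

A weak monetary response leaves the cyclical pattern in place, while a sufficiently strong one removes the fluctuation.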





EXERCISE 19.6 


1. Analyze the local stability of each of the following nonlinear systems:
   (a) $x' = e^x - 1$, $y' = y e^x$
   (b) $x' = x + 2y$, $y' = x^2 + y$
   (c) $x' = 1 - e^y$, $y' = 5x - y$
   (d) $x' = x^3 + 3x^2 y + y$, $y' = x(1 + y^2)$
2. Use Table 19.1 to determine the type of equilibrium a nonlinear system would have
   locally, given that:
   (a) $f_x = 0$, $f_y > 0$, $g_x > 0$, and $g_y = 0$
   (b) $f_x = 0$, $f_y < 0$, $g_x < 0$, and $g_y = 0$
   (c) $f_x < 0$, $f_y > 0$, $g_x < 0$, and $g_y < 0$
   Are your results consistent with your answers to Exercises 19.5-4 and 19.5-5?
3. Analyze the local stability of the Obst model, assuming that the conventional monetary
   rule is followed.
4. The following two systems both possess zero-valued Jacobians. Construct a phase
   diagram for each, and deduce the locations of all the equilibriums that exist:
   (a) $x' = x + y$, $y' = -x - y$
   (b) $x' = 0$, $y' = 0$


Chapter 20 


Optimal Control Theory 





At the end of Chap. 13, we referred to dynamic optimization as a type of problem we were
not ready to tackle because we did not yet have the tools of dynamic analysis such as
differential equations. Now that we have acquired such tools, we can finally try a taste of
dynamic optimization.

The classical approach to dynamic optimization is called the calculus of variations. In
the later development of this methodology, however, a more powerful approach known as
optimal control theory has, for the most part, supplanted the calculus of variations. For this
reason, we shall, in this chapter, confine our attention to optimal control theory, explaining
its basic nature, introducing the major solution tool called the maximum principle, and
illustrating its use in some elementary economic models.†


20.1 The Nature of Optimal Control 


In static optimization, the task is to find a single value for each choice variable, such that a
stated objective function will be maximized or minimized, as the case may be. Such a
problem is devoid of a time dimension. In contrast, time enters explicitly and prominently in a
dynamic optimization problem. In such a problem, we will always have in mind a planning
period, say from an initial time $t = 0$ to a terminal time $t = T$, and try to find the best
course of action to take during that entire period. Thus the solution for any variable will
take the form of not a single value, but a complete time path.

Suppose the problem is one of profit maximization over a time period. At any point of
time $t$, we have to choose the value of some control variable, $u(t)$, which will then affect
the value of some state variable, $y(t)$, via a so-called equation of motion. In turn, $y(t)$ will
determine the profit $\pi(t)$. Since our objective is to maximize the profit over the entire
period, the objective function should take the form of a definite integral of $\pi$ from $t = 0$ to
$t = T$. To be complete, the problem also specifies the initial value of the state variable $y$,
$y(0)$, and the terminal value of $y$, $y(T)$, or alternatively, the range of values that $y(T)$ is
allowed to take.

† For a more complete treatment of optimal control theory (as well as "calculus of variations"), the
student is referred to Elements of Dynamic Optimization by Alpha C. Chiang, McGraw-Hill, New York,
1992, now published by Waveland Press, Inc., Prospect Heights, Illinois. This chapter draws heavily
from material in this cited book.

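Taking into account the preceding, we can state the simplest problem of optimal control
as:

$$ \begin{aligned} \text{Maximize} \quad & \int_0^T F(t, y, u)\,dt \\ \text{subject to} \quad & \frac{dy}{dt} \equiv y' = f(t, y, u) \\ & y(0) = A \qquad y(T)\ \text{free} \\ \text{and} \quad & u(t) \in U \quad \text{for all } t \in [0, T] \end{aligned} \tag{20.1} $$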


The first line of (20.1), the objective function, is an integral whose integrand $F(t, y, u)$
stipulates how the choice of the control variable $u$ at time $t$, along with the resulting $y$ at
time $t$, determines our object of maximization at $t$. The second line is the equation of
motion for the state variable $y$. What this equation does is to provide the mechanism whereby
our choice of control variable $u$ can be translated into a specific pattern of movement of the
state variable $y$. Normally, the linkage between $u$ and $y$ can be adequately described by a
first-order differential equation $y' = f(t, y, u)$. However, if it happens that the pattern of
change of the state variable requires a second-order differential equation, then we must
transform this equation into a pair of first-order differential equations. In that case an
additional state variable will be introduced. Both the integrand $F$ and the equation of motion are
assumed to be continuous in all their arguments and possess continuous first-order partial
derivatives with respect to the state variable $y$ and the time variable $t$, but not necessarily the
control variable $u$. In the third line, we indicate that the initial state, the value of $y$ at $t = 0$,
is a constant $A$, but the terminal state $y(T)$ is left unrestricted. Finally, the fourth line
indicates that the permissible choices of $u$ are limited to a control region $U$. It may happen, of
course, that $u(t)$ is not restricted.





Illustration: A Simple Macroeconomic Model 
Consider an economy that produces output Y using capital K and a fixed amount of labor L, 
according to the production function 


Y=Y(K,L) 


Further, output is used either for consumption $C$ or for investment $I$. If we ignore the
problem of depreciation, then

$$ I \equiv \frac{dK}{dt} $$

In other words, investment is the change in capital stock over time. Thus we can also write
investment as

$$ I = Y - C = Y(K, L) - C = \frac{dK}{dt} $$

which gives us a first-order differential equation in the variable $K$.




If our objective is to maximize some form of social utility over a fixed planning period,
then the problem becomes

$$ \begin{aligned} \text{Maximize} \quad & \int_0^T U(C)\,dt \\ \text{subject to} \quad & \frac{dK}{dt} = Y(K, L) - C \\ \text{and} \quad & K(0) = K_0 \qquad K(T) = K_T \end{aligned} \tag{20.2} $$

where $K_0$ and $K_T$ are the initial value and terminal (target) value of $K$. Note that in (20.2),
the terminal state is a fixed value, not left free as in (20.1). Here $C$ serves as the control
variable and $K$ is the state variable. The problem is to choose the optimal control path $C(t)$
such that its impact on output $Y$ and capital $K$, and the repercussions therefrom upon $C$
itself, will together maximize the aggregate utility over the planning period.


Pontryagin’s Maximum Principle 

The key to optimal control theory is a first-order necessary condition known as the
maximum principle. The statement of the maximum principle involves an approach that is akin
to the Lagrangian function and the Lagrangian multiplier variable. For optimal control
problems, these are known as the Hamiltonian function and costate variable, concepts we
will now develop.


The Hamiltonian 

In (20.1), there are three variables: time $t$, the state variable $y$, and the control variable $u$. We
now introduce a new variable known as the costate variable and denoted by $\lambda(t)$. Like the
Lagrange multiplier, the costate variable measures the shadow price of the state variable.
The costate variable is introduced into the optimal control problem via a Hamiltonian
function (or Hamiltonian, for short). The Hamiltonian is defined as

$$ H(t, y, u, \lambda) \equiv F(t, y, u) + \lambda f(t, y, u) \tag{20.3} $$

where $H$ denotes the Hamiltonian and is a function of four variables: $t$, $y$, $u$, and $\lambda$.


The Maximum Principle 
The maximum principle, the main tool for solving problems of optimal control, is so
named because, as a first-order necessary condition, it requires us to choose $u$ so as to
maximize the Hamiltonian $H$ at every point of time.

Since, aside from the control variable, $u$, $H$ involves the state variable $y$ and costate
variable $\lambda$, the statement of the maximum principle also stipulates how $y$ and $\lambda$ should
change over time, via an equation of motion for the state variable $y$ (state equation for
short) as well as an equation of motion for the costate variable $\lambda$ (costate equation for
short). The state equation always comes as part of the problem statement itself, as in
the second equation in (20.1). But in the view that (20.3) implies $\partial H/\partial\lambda = f(t, y, u)$, the
maximum principle describes the state equation $y' = f(t, y, u)$ as

$$ y' = \frac{\partial H}{\partial \lambda} \tag{20.4} $$

* The term "maximum principle" is attributed to L. S. Pontryagin and his associates, and is often
referred to as Pontryagin's maximum principle. See The Mathematical Theory of Optimal Control
Processes by L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko, Interscience,
New York, 1962 (translated by K. N. Trirogoff).
In contrast, $\lambda$ does not appear in the problem statement (20.1) and its equation of motion
enters into the picture purely as an optimization condition. The costate equation is

$$ \lambda' \left( \equiv \frac{d\lambda}{dt} \right) = -\frac{\partial H}{\partial y} \tag{20.5} $$

Note that both equations of motion are stated in terms of the partial derivatives of $H$,
suggesting some symmetry, but there is a negative sign attached to $\partial H/\partial y$ in (20.5).

Equations (20.4) and (20.5) constitute a system of two differential equations. Thus we
need two boundary conditions to definitize the two arbitrary constants that will arise in
the process of solution. If both the initial state $y(0)$ and the terminal state $y(T)$ are fixed,
then these specifications can be used to definitize the constants. But if, as in problem (20.1),
the terminal state is not fixed, then something called a transversality condition must be
included as part of the maximum principle, to fill the gap left by the missing boundary
condition.

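Summing up the preceding, we can state the various components of the maximum
principle for problem (20.1) as follows:

$$ \begin{aligned} &\text{(i)} & & H(t, y, u^*, \lambda) \geq H(t, y, u, \lambda) \quad \text{for all } t \in [0, T] \\ &\text{(ii)} & & y' = \frac{\partial H}{\partial \lambda} \quad \text{(state equation)} \\ &\text{(iii)} & & \lambda' = -\frac{\partial H}{\partial y} \quad \text{(costate equation)} \\ &\text{(iv)} & & \lambda(T) = 0 \quad \text{(transversality condition)} \end{aligned} \tag{20.6} $$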


Condition i in (20.6) states that at every time $t$ the value of $u(t)$, the optimal control,
must be chosen so as to maximize the value of the Hamiltonian over all admissible values
of $u(t)$. In the case where the Hamiltonian is differentiable with respect to $u$ and yields an
interior solution, Condition i can be replaced by

$$ \frac{\partial H}{\partial u} = 0 $$

However, if the control region is a closed set, then boundary solutions are possible and
$\partial H/\partial u = 0$ may not apply. In fact, the maximum principle does not even require the
Hamiltonian to be differentiable with respect to $u$.

Conditions ii and iii of the maximum principle, $y' = \partial H/\partial\lambda$ and $\lambda' = -\partial H/\partial y$, give us
two equations of motion, referred to as the Hamiltonian system for the given problem.
Condition iv, $\lambda(T) = 0$, is the transversality condition appropriate for the
free-terminal-state problem only.


Example 1

[FIGURE 20.1]

To illustrate the use of the maximum principle, let us first consider a simple noneconomic
example: that of finding the shortest path from a given point $A$ to a given straight line. In
Fig. 20.1, we have plotted the point $A$ on the vertical axis in the $ty$ plane, and drawn the
straight line as a vertical one at $t = T$. Three (out of an infinite number of) admissible paths
are shown, each with a different length. The length of any path is the aggregate of small
path segments, each of which can be considered as the hypotenuse (not drawn) of a
triangle formed by small movements $dt$ and $dy$. Denoting the hypotenuse by $dh$, we have, by
Pythagoras's theorem,

$$ dh^2 = dt^2 + dy^2 $$

Dividing both sides by $dt^2$ and taking the square root yields

$$ \frac{dh}{dt} = \left[ 1 + \left( \frac{dy}{dt} \right)^2 \right]^{1/2} = [1 + (y')^2]^{1/2} \tag{20.7} $$

The total length of the path can then be found by integrating (20.7) with respect to $t$, from
$t = 0$ to $t = T$. If we let $y' = u$ be the control variable, (20.7) can be expressed as

$$ \frac{dh}{dt} = (1 + u^2)^{1/2} \tag{20.7'} $$

To minimize the integral of (20.7') is, of course, equivalent to maximizing the negative of
(20.7'). Thus the shortest-path problem is:

$$ \begin{aligned} \text{Maximize} \quad & \int_0^T -(1 + u^2)^{1/2}\,dt \\ \text{subject to} \quad & y' = u \\ \text{and} \quad & y(0) = A \qquad y(T)\ \text{free} \end{aligned} $$

The Hamiltonian for the problem is, by (20.3),

$$ H = -(1 + u^2)^{1/2} + \lambda u $$














Since $H$ is differentiable in $u$, and $u$ is unrestricted, the following first-order condition can be
used to maximize $H$:

$$ \frac{\partial H}{\partial u} = -\frac{1}{2}(1 + u^2)^{-1/2}(2u) + \lambda = 0 $$

$$ \text{or} \qquad u(t) = \lambda(1 - \lambda^2)^{-1/2} $$

Checking the second-order condition, we find that

$$ \frac{\partial^2 H}{\partial u^2} = -(1 + u^2)^{-3/2} < 0 $$

which verifies that the solution to $u(t)$ does maximize the Hamiltonian. Since $u(t)$ is a
function of $\lambda$, we need a solution to the costate variable. From the first-order conditions, the
equation of motion for the costate variable is

$$ \lambda' = -\frac{\partial H}{\partial y} = 0 $$

since $H$ is independent of $y$. Thus, $\lambda$ is a constant. To definitize this constant, we can make
use of the transversality condition $\lambda(T) = 0$. Since $\lambda$ can take only a single value, now
known to be zero, we actually have $\lambda(t) = 0$ for all $t$. Thus we can write

$$ \lambda^*(t) = 0 \qquad \text{for all } t \in [0, T] $$

It follows that the optimal control is

$$ u^*(t) = \lambda^*(1 - \lambda^{*2})^{-1/2} = 0 $$

Finally, using the equation of motion for the state variable, we see that

$$ y' = u = 0 $$

$$ \text{or} \qquad y^*(t) = c_0 \qquad \text{(a constant)} $$

Incorporating the initial condition

$$ y(0) = A $$

we can conclude that $c_0 = A$, and write

$$ y^*(t) = A \qquad \text{for all } t $$

In Fig. 20.1, this path is the line $AB$. The shortest path is found to be a straight line with a
zero slope.
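
A rough numerical comparison supports this conclusion. The sketch below approximates the path length, the integral of $(1 + u^2)^{1/2}$, for a few candidate controls; the horizon $T = 1$, the candidate controls, and the function name `path_length` are assumptions made only for illustration.

```python
import math


def path_length(u, T=1.0, n=2000):
    """Midpoint-rule approximation of the integral of (1 + u(t)^2)^(1/2) over [0, T]."""
    dt = T / n
    return sum(math.sqrt(1.0 + u((i + 0.5) * dt) ** 2) * dt for i in range(n))


# Hypothetical admissible controls u(t) = y'(t); only the first is the optimal one.
candidates = {
    "u = 0 (line AB)": lambda t: 0.0,
    "u = 0.5": lambda t: 0.5,
    "u = sin(2*pi*t)": lambda t: math.sin(2 * math.pi * t),
}

lengths = {name: path_length(u) for name, u in candidates.items()}
shortest = min(lengths, key=lengths.get)
```

Among these candidates the zero-slope path has the smallest length, equal to the horizontal distance $T$.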


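Example 2

Find the optimal control path that will

$$ \begin{aligned} \text{Maximize} \quad & \int_0^1 (y - u^2)\,dt \\ \text{subject to} \quad & y' = u \\ \text{and} \quad & y(0) = 5 \qquad y(1)\ \text{free} \end{aligned} $$

This problem is in the format of (20.1), except that $u$ is unrestricted.
The Hamiltonian for this problem,

$$ H = y - u^2 + \lambda u $$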


is concave in $u$, and $u$ is unrestricted, so we can maximize $H$ by applying the first-order
condition (also sufficient because of concavity of $H$):

$$ \frac{\partial H}{\partial u} = -2u + \lambda = 0 $$

which gives us

$$ u(t) = \frac{\lambda}{2} \qquad \text{or} \qquad y' = \frac{\lambda}{2} \tag{20.8} $$

The equation of motion for $\lambda$ is

$$ \lambda' = -\frac{\partial H}{\partial y} = -1 \tag{20.8'} $$

The last two equations constitute the differential-equation system for this problem.
We can first solve for $\lambda$ by straight integration of (20.8') to get

$$ \lambda(t) = c_1 - t \qquad (c_1\ \text{arbitrary}) $$

Moreover, by the transversality condition in (20.6), we must have $\lambda(1) = 0$. Setting $t = 1$ in
the last equation yields $c_1 = 1$. Thus the optimal costate path is

$$ \lambda^*(t) = 1 - t $$

It follows that $y' = \frac{1}{2}(1 - t)$, by (20.8), and by integration,

$$ y(t) = \frac{1}{2}t - \frac{1}{4}t^2 + c_2 \qquad (c_2\ \text{arbitrary}) $$

The arbitrary constant can be definitized by using the initial condition $y(0) = 5$. Setting
$t = 0$ in the preceding equation, we get $5 = y(0) = c_2$. Thus the optimal path for the state
variable is

$$ y^*(t) = \frac{1}{2}t - \frac{1}{4}t^2 + 5 $$

and the corresponding optimal control path is

$$ u^*(t) = \frac{1}{2}(1 - t) $$
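
The optimality of this control path can be checked numerically by comparing the value of the objective under $u^*(t) = \frac{1}{2}(1 - t)$ with its value under a few perturbed controls. The sketch below assumes a midpoint-rule integration; the function names and the perturbations are hypothetical.

```python
def objective(u, y0=5.0, T=1.0, n=2000):
    """Approximate the integral of (y - u^2) over [0, T], with y' = u and y(0) = y0."""
    dt = T / n
    y, total = y0, 0.0
    for i in range(n):
        u_mid = u((i + 0.5) * dt)          # control at the middle of the step
        y_mid = y + 0.5 * dt * u_mid       # state at the middle of the step
        total += (y_mid - u_mid ** 2) * dt
        y += dt * u_mid                    # state at the end of the step
    return total


def u_star(t):
    """Optimal control path found in the example."""
    return 0.5 * (1.0 - t)


best = objective(u_star)
# Hypothetical alternative controls: constant shifts of u*, and no control at all.
others = [objective(lambda t, k=k: u_star(t) + k) for k in (-0.2, 0.2)]
others.append(objective(lambda t: 0.0))
```

Substituting the optimal paths into the objective gives $5 + \frac{1}{12}$, which the numerical value of `best` approximates; each alternative yields a smaller value.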


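Example 3

Find the optimal control path that will

$$ \begin{aligned} \text{Maximize} \quad & \int_0^2 (2y - 3u)\,dt \\ \text{subject to} \quad & y' = y + u \\ & y(0) = 4 \qquad y(2)\ \text{free} \\ \text{and} \quad & u(t) \in [0, 2] \end{aligned} $$

The fact that the control variable is restricted to the closed set $[0, 2]$ gives rise to the
possibility of boundary solutions.
The Hamiltonian function

$$ H = 2y - 3u + \lambda(y + u) = (2 + \lambda)y + (\lambda - 3)u $$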


[FIGURE 20.2: $H$ plotted against $u$ over the control region $0 \leq u \leq 2$, showing Line 1, Line 2, and the maximum of $H$]

is linear in $u$. If we plot $H$ against $u$ in the $uH$ plane, we get a straight line with slope
$\partial H/\partial u = \lambda - 3$, which is positive if $\lambda > 3$ (Line 1), but negative if $\lambda < 3$ (Line 2), as
illustrated in Fig. 20.2. If at any time $\lambda$ exceeds 3, then the maximum $H$ occurs at the upper
boundary of the control region and we must choose $u = 2$. If, on the other hand, $\lambda$ falls
below 3, then in order to maximize $H$, we must choose $u = 0$. In short, $u^*(t)$ depends on
$\lambda(t)$ as follows:

$$ u^*(t) = \begin{Bmatrix} 2 \\ 0 \end{Bmatrix} \quad \text{if} \quad \lambda(t) \begin{Bmatrix} > \\ < \end{Bmatrix} 3 \tag{20.9} $$

Thus, it is critical to find $\lambda(t)$. To do this, we start from the costate equation

$$ \lambda' = -\frac{\partial H}{\partial y} = -2 - \lambda \qquad \text{or} \qquad \lambda' + \lambda = -2 $$

The general solution of this equation is

$$ \lambda(t) = Ae^{-t} - 2 \qquad \text{[by (15.5)]} $$

where $A$ is an arbitrary constant. By using the transversality condition $\lambda(T) = \lambda(2) = 0$, we
find that $A = 2e^2$. Thus the definite solution for $\lambda$ is

$$ \lambda^*(t) = 2e^{2-t} - 2 \tag{20.10} $$

which is a decreasing function of $t$, falling steadily from the initial value $\lambda^*(0) = 2e^2 - 2 =
12.778$ to a terminal value $\lambda^*(2) = 2e^0 - 2 = 0$. This means that $\lambda^*$ must pass through the
point $\lambda = 3$ at some critical time $\tau$, when the optimal $u$ has to be switched from $u^* = 2$ to
$u^* = 0$.

To find this critical time $\tau$, we set $\lambda^*(\tau) = 3$ in (20.10):

$$ 3 = \lambda^*(\tau) = 2e^{2-\tau} - 2 \qquad \text{or} \qquad e^{2-\tau} = \frac{5}{2} = 2.5 $$

Taking the natural log of both sides, we get

$$ \ln e^{2-\tau} = \ln 2.5 \qquad \text{or} \qquad 2 - \tau = \ln 2.5 $$

Thus

$$ \tau = 2 - \ln 2.5 = 1.084 \qquad \text{(approx.)} $$

and the optimal control turns out to consist of two phases in the time interval $[0, 2]$:

$$ \text{Phase 1: } u^*[0, \tau) = 2 \qquad\qquad \text{Phase 2: } u^*[\tau, 2] = 0 $$
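
The switching time and the two-phase control can be reproduced directly from (20.9) and (20.10). The following is a minimal sketch; the function names `costate` and `optimal_control` are hypothetical, and because rule (20.9) is silent at the instant where $\lambda = 3$ exactly, the code simply assigns $u = 0$ whenever $\lambda$ does not exceed 3.

```python
import math


def costate(t):
    """Optimal costate path (20.10): lambda*(t) = 2 e^(2 - t) - 2."""
    return 2.0 * math.exp(2.0 - t) - 2.0


def optimal_control(t):
    """Bang-bang rule (20.9): u = 2 while lambda > 3, otherwise u = 0."""
    return 2.0 if costate(t) > 3.0 else 0.0


tau = 2.0 - math.log(2.5)          # switching time, where lambda*(tau) = 3
initial_costate = costate(0.0)     # 2 e^2 - 2
terminal_costate = costate(2.0)    # zero, as the transversality condition requires
```

Evaluating `optimal_control` at a time before `tau` returns the upper bound of the control region, and at a time after `tau` the lower bound.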


20.2 Alternative Terminal Conditions 





What happens to the maximum principle when the terminal condition is different from the
one in (20.1)? In (20.1), we face a vertical terminal line, with a fixed terminal time but
unrestricted terminal state as illustrated in Fig. 20.1. The maximum principle for the
maximization problem requires that

$$ \begin{aligned} &\text{(i)} & & H(t, y, u^*, \lambda) \geq H(t, y, u, \lambda) \quad \text{for all } t \in [0, T] \\ &\text{(ii)} & & y' = \frac{\partial H}{\partial \lambda} \\ &\text{(iii)} & & \lambda' = -\frac{\partial H}{\partial y} \end{aligned} $$

with the transversality condition

$$ \text{(iv)} \qquad \lambda(T) = 0 $$

With alternative terminal conditions, Conditions i, ii, and iii will remain the same, but
Condition iv (the transversality condition) must be duly modified.


Fixed Terminal Point 
If the terminal point is fixed so that the terminal condition is $y(T) = y_T$ with both $T$ and $y_T$
given, then the terminal condition itself should provide the information to definitize one
constant. In this case, no transversality condition is needed.


Horizontal Terminal Line 
Suppose that the terminal state is fixed at a given target level $y_T$ but the terminal time $T$ is
free, so that we have the flexibility to reach the target in a hurry or at a leisurely pace. We
then have a horizontal terminal line as illustrated in Fig. 20.3a, which allows us to choose
between $T_1$, $T_2$, $T_3$, or other terminal times to reach the target level of $y$. For this case, the
transversality condition is a restriction on the Hamiltonian (rather than the costate variable)
at $t = T$:

$$ [H]_{t=T} = 0 \tag{20.11} $$


Truncated Vertical Terminal Line 

If we have a fixed terminal time $T$, and the terminal state is free but subject to the proviso
that $y_T \geq y_{\min}$, where $y_{\min}$ denotes a given minimum permissible level of $y$, we face a
truncated vertical terminal line, as illustrated in Fig. 20.3b.

[FIGURE 20.3: (a) horizontal terminal line; (b) truncated vertical terminal line; (c) truncated horizontal terminal line]

The transversality condition for this case can be stated like the complementary-slackness
condition found in the Kuhn-Tucker conditions:

$$ \lambda(T) \geq 0 \qquad y_T \geq y_{\min} \qquad (y_T - y_{\min})\lambda(T) = 0 \tag{20.12} $$

The practical approach for solving this type of problem is to first try $\lambda(T) = 0$ as the
transversality condition and test if the resulting $y_T^*$ satisfies the restriction $y_T^* \geq y_{\min}$. If so,
the problem is solved. If not, then treat the problem as a given terminal point problem with
$y_{\min}$ as the terminal state.


Truncated Horizontal Terminal Line 

When the terminal state is fixed at yy and the terminal time is free but subject to the re- 
striction T < Tnx, Where Tax denotes the latest permissible time (a deadline) to reach 
the given py, we face a truncated horizontal terminal line as illustrated in Pig. 20.3c. The 
transversality condition becomes 





Hoty 20 TS Tay (TF — Tre) Paty, = 9 (20.13) 
This again appears in the format of the complementary-slackness condition. 
The practical approach to solving this type of problem is to try H=z,,, = 0 first. If the 


resulting solution value is 7* < Tmax, then the problem is solved. If-not, then we must take 


Example 1 


Chapter 20 Qptintal Contral Theory 641 
Tmax a5 a fixed terminal time which, together with the given y;, defines a fixed end point, 
and solve the problem as a fixed-end-point problem. 
In the problem 
1 
Maximize [ (y-w) dt 
0 
subjectto = y’=u 
and yO)=2  l)=a 
the terminal point is fixed, even though (1) is assigned a parametric rather than numerical 


value here, 
The Hamiltonian function 


Hay-wtaw 
is concave in u, so we can set dH /du = O to maximize H: 
oH 
= -2u+i=0 
du 
Thus 
A 
uss 
2 


which shows that in order to solve for u(t), we need to solve for 4(£) first. 
The two equations of motion are 





Direct integration of the last equation yields 
Asc —f (cy arbitrary) 


which implies that 
==9- 1, 
yasa~s 


Again, by direct integration, we find that 


yxO= Si te +¢2 {cz arbitrary) 


To definitize the two arbitrary constants, we make use of the initial condition y(0) = 2, and 
the terminal condition y(1) = a. Setting t= 0 and t= 1, successively, in the preceding 
equation, we obtain 


q 1 
2= 0) =a a=\"=F-Z4a 


Thus, ¢ = 2, and q =2a—-5. 


642 Part Five Drnamic Analysis 


Therefore, we can write the optimal paths of this problem as:

$$y^*(t) = \left(a - \frac{7}{4}\right)t - \frac{1}{4}t^2 + 2$$

$$\lambda^*(t) = 2a - \frac{7}{2} - t$$

$$u^*(t) = a - \frac{7}{4} - \frac{1}{2}t$$
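The optimal paths of Example 1 can be checked numerically against the conditions of the maximum principle. The following is only a minimal sketch: the function names `example1_paths` and `check_example1`, the grid of 99 interior points, and the step size `h` are our own illustrative choices, not part of the example. It confirms the boundary conditions, $y' = u$, $\lambda' = -1$, and $\partial H/\partial u = 0$ for any chosen value of $a$.

```python
def example1_paths(a):
    """Closed-form optimal paths of Example 1 for the terminal value y(1) = a."""
    c1 = 2 * a - 3.5                      # constant in lambda(t) = c1 - t
    y = lambda t: (a - 1.75) * t - 0.25 * t**2 + 2
    lam = lambda t: c1 - t
    u = lambda t: a - 1.75 - 0.5 * t
    return y, lam, u


def check_example1(a, h=1e-6):
    """Return the largest violation of the necessary conditions on a grid."""
    y, lam, u = example1_paths(a)
    worst = max(abs(y(0) - 2), abs(y(1) - a))        # boundary conditions
    for i in range(1, 100):
        t = i / 100
        dy = (y(t + h) - y(t - h)) / (2 * h)         # central difference for y'
        dlam = (lam(t + h) - lam(t - h)) / (2 * h)   # central difference for lambda'
        worst = max(worst,
                    abs(dy - u(t)),                  # state equation y' = u
                    abs(dlam + 1),                   # costate equation lambda' = -1
                    abs(-2 * u(t) + lam(t)))         # first-order condition dH/du = 0
    return worst
```

Calling `check_example1(3.0)` returns a number very close to zero, which is what we expect if the three paths satisfy all the conditions simultaneously.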


Example 2  The problem

$$\begin{aligned}
\text{Maximize} \quad & \int_0^T -(t^2 + u^2)\,dt \\
\text{subject to} \quad & y' = u \\
\text{and} \quad & y(0) = 4 \qquad y(T) = 5 \qquad T \text{ free}
\end{aligned}$$

exemplifies the case of horizontal terminal line where the terminal state is fixed but the time of arrival at the target level of $y$ is unrestricted. In fact, it is one of our tasks to solve for the optimal value of $T$.

Since the Hamiltonian

$$H = -t^2 - u^2 + \lambda u$$

is concave in $u$, we can again maximize $H$ by using the first-order condition

$$\frac{\partial H}{\partial u} = -2u + \lambda = 0$$

which gives us

$$u = \frac{\lambda}{2} \tag{20.14}$$


The concavity of $H$ makes it unnecessary to check the second-order condition, but if we wish, it is easy to check that $\partial^2 H/\partial u^2 = -2 < 0$, sufficient for a maximum of $H$.
The equation of motion for $\lambda$ is

$$\lambda' = -\frac{\partial H}{\partial y} = 0$$

which implies that $\lambda$ is a constant. But we cannot yet determine its exact value at this point.
Turning to the equation of motion for $y$,

$$y' = u = \frac{\lambda}{2} \qquad [\text{by } (20.14)]$$

we can obtain, by direct integration,

$$y(t) = \frac{\lambda}{2}\,t + c \tag{20.15}$$


Since $y(0) = 4$, we see that $c = 4$. Furthermore, the transversality condition (20.11) requires that

$$-T^2 - \frac{\lambda^2}{4} + \frac{\lambda^2}{2} = 0 \qquad [\text{by } (20.14)]$$

Solving the preceding equation for $T$, and taking the positive square root, we get

$$T = \frac{\lambda}{2} \tag{20.16}$$


Since $\lambda$ is constant, so is $T$. We try now to find its exact value.
Applying the terminal-state specification $y(T) = 5$ to (20.15), and recalling that $c = 4$, we get

$$y(T) = \frac{\lambda}{2}\,T + 4 = 5$$

In view of (20.16), the last equation can be rewritten as $T^2 = 1$. Thus, by taking the square root, we can determine the optimal arrival time to be

$$T^* = 1 \qquad (\text{negative root unacceptable})$$


From this, we can readily deduce that

$$\lambda^* = 2T^* = 2 \qquad [\text{by } (20.16)]$$

$$u^* = \frac{\lambda^*}{2} = 1 \qquad [\text{by } (20.14)]$$

$$y^*(t) = t + 4 \qquad [\text{by } (20.15)]$$

The last result shows that, in this example, the optimal $y$ path is a straight line going from the given initial point to the horizontal terminal line.
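As a rough cross-check of $T^* = 1$, we can compare the value of the objective integral across different arrival times. The sketch below assumes what the example has already shown, namely that $\lambda$ (and hence $u = \lambda/2$) is constant, so that reaching $y = 5$ from $y = 4$ at time $T$ requires $u = 1/T$. Under that assumption the integral equals $-(T^3/3 + 1/T)$. The function names and the search grid are hypothetical choices made only for illustration.

```python
def payoff(T):
    """Value of the integral of -(t^2 + u^2) on [0, T] with constant u = 1/T."""
    return -(T**3 / 3 + 1 / T)


def best_arrival_time(lo=0.1, hi=3.0, steps=2900):
    """Grid search for the arrival time with the largest payoff."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(grid, key=payoff)
```

The grid search picks an arrival time within one grid step of $T = 1$, in agreement with the transversality condition, and `payoff(1.0)` evaluates to $-4/3$.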





EXERCISE 20.2
Find the optimal paths of the control, state, and costate variables that will

1. Maximize  $\int_0^1 (y - u^2)\,dt$
   subject to  $y' = u$
   and  $y(0) = 2$,  $y(1)$ free

2. Maximize  $\int_0^8 6y\,dt$
   subject to  $y' = y + u$
   $y(0) = 10$,  $y(8)$ free
   and  $u(t) \in [0, 2]$

3. Maximize  $\int_0^T -(au + bu^2)\,dt$
   subject to  $y' = y - u$
   and  $y(0) = y_0$,  $y(T)$ free
T 
4. Maximize [ (yu-u? — dt 
0 


subject to yo=u 
and yO) = yo y(t) free 




5. Maximize 


subject to 
and 


6. Maximize 


subject to 


and 


7. Maximize 


subject to 
and 


8, Maximize 


subject to 
and 


9. Maximize 


subject to 
and 


1, 
—sut dt 
[3 


you 
y(0) = 10 
4 


y(20) = 0 


y(0) =5 
Osu) <2 


(4) = 300 


y(Q) =0 


2 
| (y +ut—u)dt 
1 


, 





y 
WYs3 9 y2ja4 


[e- 3u ~ au?) dt 
0 


Ysuty 


O)='5 {2} free 


20.3 Autonomous Problems 





In the general control problem framework, the variable $t$ can enter the objective function and state equation directly. The general specification

$$\begin{aligned}
\text{Maximize} \quad & \int_0^T F(t, y, u)\,dt \\
\text{subject to} \quad & y' = f(t, y, u) \\
\text{and} \quad & \text{boundary conditions}
\end{aligned}$$

where $t$ explicitly enters into $F$ and $f$, means that the date matters. That is, the value generated by the activity $u(t)$ depends not only on the level, but also on exactly when this activity takes place.


Problems in which $t$ is absent from the objective function and state equation, such as

$$\begin{aligned}
\text{Maximize} \quad & \int_0^T F(y, u)\,dt \\
\text{subject to} \quad & y' = f(y, u) \\
\text{and} \quad & \text{boundary conditions}
\end{aligned}$$

are called autonomous problems. In such problems, since the Hamiltonian

$$H = F(y, u) + \lambda f(y, u)$$

does not contain $t$ as an argument, the equations of motion are easier to solve; moreover, they are amenable to the use of phase-diagram analysis.

In still other cases, in an otherwise autonomous problem, time $t$ enters into the picture as part of the discount factor $e^{-\rho t}$, but nowhere else, so that the objective function takes the form of

$$\int_0^T G(y, u)\,e^{-\rho t}\,dt$$

Strictly speaking, this problem is nonautonomous. However, it is easy to convert the problem into an autonomous one by employing the so-called current-value Hamiltonian, defined as:

$$H_c \equiv H e^{\rho t} = G(y, u) + \mu f(y, u) \tag{20.17}$$

where

$$\mu \equiv \lambda e^{\rho t} \tag{20.18}$$

is the current-value Lagrange multiplier. By focusing on the current (undiscounted) value, we are able to eliminate $t$ from the original Hamiltonian.
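Using $H_c$ in lieu of $H$, we must revise the maximum principle to:

$$\begin{aligned}
&\text{(i)} \quad H_c(y, u^*, \mu) \geq H_c(y, u, \mu) \qquad \text{for all } t \in [0, T] \\
&\text{(ii)} \quad y' = \frac{\partial H_c}{\partial \mu} \\
&\text{(iii)} \quad \mu' = -\frac{\partial H_c}{\partial y} + \rho\mu \\
&\text{(iv)} \quad \mu(T) = 0 \qquad \text{(for vertical terminal line)} \\
&\phantom{\text{(iv)}} \quad \text{or } [H_c]_{t=T} = 0 \qquad \text{(for horizontal terminal line)}
\end{aligned} \tag{20.19}$$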
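The extra term in the revised costate equation (iii) follows from the definition of the current-value multiplier. The short derivation below is offered only as a sketch. It assumes the discount rate is written as $\rho$, that $H = H_c e^{-\rho t}$ by (20.17), and that the original costate equation is $\lambda' = -\partial H/\partial y$.

```latex
\begin{align*}
\mu &= \lambda e^{\rho t}
  \quad\Longrightarrow\quad
  \mu' = \lambda' e^{\rho t} + \rho \lambda e^{\rho t}
       = \lambda' e^{\rho t} + \rho \mu \\
\lambda' &= -\frac{\partial H}{\partial y}
          = -\frac{\partial H_c}{\partial y}\, e^{-\rho t}
  \quad\Longrightarrow\quad
  \mu' = -\frac{\partial H_c}{\partial y} + \rho \mu \\
\lambda(T) &= 0
  \quad\Longleftrightarrow\quad
  \mu(T) = \lambda(T)\, e^{\rho T} = 0
\end{align*}
```

The last line shows why the vertical-terminal-line condition carries over unchanged: multiplying by the positive factor $e^{\rho T}$ cannot turn a zero into a nonzero value or vice versa.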


20.4 Economic Applications 





Lifetime Utility Maximization 
Suppose a consumer has the utility function $U(C(t))$, where $C(t)$ is consumption at time $t$. The consumer's utility function is concave, and has the following properties:

$$U' > 0 \qquad U'' < 0$$

The consumer is also endowed with an initial stock of wealth, or capital, $K_0$, with income stream derived from the stock of capital according to the following:

$$Y = rK$$

where $r$ is the market rate of interest. The consumer uses the income to purchase $C$. In addition, the consumer can consume the capital stock. Any income not consumed is added to the capital stock as investment. Thus,

$$K' \equiv \frac{dK}{dt} = Y - C = rK - C$$


The consumer's lifetime utility maximization problem is to

$$\begin{aligned}
\text{Maximize} \quad & \int_0^T U(C(t))\,e^{-\delta t}\,dt \\
\text{subject to} \quad & K' = rK(t) - C(t) \\
\text{and} \quad & K(0) = K_0 \qquad K(T) \geq 0
\end{aligned}$$

where $\delta$ is the consumer's personal rate of time preference ($\delta > 0$). It is assumed that $C(t) > 0$ and $K(t) > 0$ for all $t$.
The Hamiltonian is

$$H = U(C(t))\,e^{-\delta t} + \lambda(t)\,[rK(t) - C(t)]$$

where $C$ is the control variable, and $K$ is the state variable. Since $U(C)$ is concave, and the constraint is linear in $C$, we know that the Hamiltonian is concave and the maximization of $H$ can be achieved by simply setting $\partial H/\partial C = 0$. Thus we have

$$\frac{\partial H}{\partial C} = U'(C)\,e^{-\delta t} - \lambda = 0 \tag{20.20}$$

$$K' = rK(t) - C(t) \tag{20.20'}$$

$$\lambda' = -\frac{\partial H}{\partial K} = -r\lambda \tag{20.20''}$$


Equation (20.20) states that the discounted marginal utility should be equated to the present shadow price of an additional unit of capital. Differentiating (20.20) with respect to $t$, we get

$$U''(C)\,C'\,e^{-\delta t} - \delta U'(C)\,e^{-\delta t} = \lambda' \tag{20.21}$$

In view of (20.20) and (20.20'') we have

$$\lambda' = -r\lambda = -rU'(C)\,e^{-\delta t}$$

which can be substituted into (20.21) to yield

$$U''(C)\,C'\,e^{-\delta t} - \delta U'(C)\,e^{-\delta t} = -rU'(C)\,e^{-\delta t}$$

or, after canceling the common factor $e^{-\delta t}$ and rearranging,

$$-\frac{U''(C)}{U'(C)}\,C'(t) = r - \delta$$

Since $U' > 0$ and $U'' < 0$, the sign of the derivative $C'(t)$ has to be the same as $(r - \delta)$. Therefore, if $r > \delta$, the optimal consumption will rise over time; if $r < \delta$, the optimal consumption will decline over time.

Solving (20.20'') directly gives us

$$\lambda(t) = \lambda_0\,e^{-rt}$$

where $\lambda_0 > 0$ is the constant of integration. Combining this with (20.20) gives us

$$U'(C(t)) = \lambda\,e^{\delta t} = \lambda_0\,e^{(\delta - r)t}$$

which shows that the marginal utility of consumption will optimally decrease over time if $r > \delta$, but increase over time if $r < \delta$.

Since the terminal condition $K(T) \geq 0$ identifies the present problem as one with a truncated vertical terminal line, the appropriate transversality condition is, by (20.12),

$$\lambda(T) \geq 0 \qquad K(T) \geq 0 \qquad K(T)\,\lambda(T) = 0$$

The key condition is the complementary-slackness stipulation, which means that either the capital stock $K$ must be exhausted on the terminal date, or the shadow price of capital $\lambda$ must fall to zero on the terminal date. By assumption, $U'(C) > 0$, so the marginal utility can never be zero. Therefore, the marginal value of capital cannot be zero. This implies that the capital stock should optimally be exhausted by the terminal date $T$ in this model.
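To see these results at work, it helps to pick a specific utility function. The sketch below assumes logarithmic utility, $U(C) = \ln C$, which is our own illustrative choice and not part of the model above. In that case $-U''/U' = 1/C$, so the consumption rule becomes $C' = (r - \delta)C$, giving $C(t) = C_0 e^{(r-\delta)t}$, and the exhaustion condition $K(T) = 0$ pins down $C_0 = \delta K_0 / (1 - e^{-\delta T})$. The function names and the midpoint integration scheme are likewise hypothetical.

```python
import math


def log_utility_plan(K0, r, delta, T):
    """Initial consumption C0 and the path C(t), assuming U(C) = ln C and K(T) = 0."""
    C0 = delta * K0 / (1 - math.exp(-delta * T))
    return C0, (lambda t: C0 * math.exp((r - delta) * t))


def terminal_capital(K0, r, delta, T, steps=20000):
    """Integrate K' = rK - C with a midpoint scheme and return K(T)."""
    _, C = log_utility_plan(K0, r, delta, T)
    dt = T / steps
    K = K0
    for i in range(steps):
        t = i * dt
        K_mid = K + 0.5 * dt * (r * K - C(t))       # half step with the slope at t
        K += dt * (r * K_mid - C(t + 0.5 * dt))     # full step with the midpoint slope
    return K
```

With, say, $K_0 = 100$, $r = 0.05$, $\delta = 0.03$, and $T = 40$, consumption rises over time (since $r > \delta$) and the simulated capital stock is run down to approximately zero at $T$, as the transversality condition requires.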


Exhaustible Resource 


Let $s(t)$ denote a stock of an exhaustible resource and $q(t)$ be the rate of extraction at any time $t$ such that

$$s' = -q$$

The extracted resource produces a final consumer good $c$ such that

$$c = c(q) \qquad \text{where } c' > 0,\ c'' < 0 \tag{20.22}$$

The consumption good is the sole argument in the utility function of a representative consumer with the following properties:

$$U = U(c) \qquad \text{where } U' > 0,\ U'' < 0 \tag{20.22'}$$

The consumer wishes to maximize the utility function over a given interval $[0, T]$. Since $c$ is a function of $q$, the rate of extraction, $q$ will serve as the control variable. For simplicity, we ignore the issue of discounting over time. The dynamic problem is then to choose the optimal extraction rate that maximizes the utility function subject only to a nonnegativity constraint on the state variable $s(t)$, the stock of the exhaustible resource. The formulation is

$$\begin{aligned}
\text{Maximize} \quad & \int_0^T U(c(q))\,dt \\
\text{subject to} \quad & s' = -q \\
\text{and} \quad & s(0) = s_0 \qquad s(T) \geq 0
\end{aligned} \tag{20.23}$$

where $s_0$ and $T$ are given.
The Hamiltonian for the problem is

$$H = U(c(q)) - \lambda q$$

Since $H$ is concave in $q$ by model specifications on the $U(c(q))$ function, we can maximize $H$ by setting $\partial H/\partial q = 0$:

$$\frac{\partial H}{\partial q} = U'(c(q))\,c'(q) - \lambda = 0 \tag{20.24}$$

The concavity of $H$ assures us that (20.24) maximizes $H$, but we can easily check the second-order condition and confirm that $\partial^2 H/\partial q^2$ is negative.


The maximum principle stipulates that

$$\lambda' = -\frac{\partial H}{\partial s} = 0$$

which implies that

$$\lambda(t) = c_0 \qquad \text{a constant} \tag{20.25}$$

To determine $c_0$, we turn to the transversality conditions. Since the model specifies $s(T) \geq 0$, it has a truncated vertical terminal line, so (20.12) applies:

$$\lambda(T) \geq 0 \qquad s(T) \geq 0 \qquad s(T)\,\lambda(T) = 0$$

In practical applications, the initial step is to try $\lambda(T) = 0$, solve for $q$, and see if the solution will work. Since $\lambda(t)$ is a constant, to try $\lambda(T) = 0$ implies $\lambda(t) = 0$ for all $t$, and $\partial H/\partial q$ in (20.24) reduces to

$$U'(c)\,c'(q) = 0$$

which (in principle) can be solved for $q$. Since $t$ is not an explicit argument of $U$ or $c$, the solution path for $q$ is constant over time:

$$q^*(t) = q^*$$
Now, we check if $q^*$ satisfies the restriction $s(T) \geq 0$. If $q^*$ is a constant, then the equation of motion

$$s' = -q$$

can be readily integrated, yielding

$$s(t) = -qt + c_1 \qquad [c_1 = \text{constant of integration}]$$

Using the initial condition $s(0) = s_0$ yields a solution for the constant of integration

$$c_1 = s_0$$

and the optimal state path is

$$s(t) = s_0 - q^* t \tag{20.26}$$

Without specifying the functional forms for $U$ and $c$, no numerical solution can be found for $q^*$. However, from the transversality conditions, we can conclude that if $s(T) \geq 0$, then $q^*$ as derived in the solution is acceptable. But if $s(T) < 0$ for the given $q^*$, then the extraction rate is too high and we need to find a different solution. Since the trial solution $\lambda(T) = 0$ failed, we now take the alternative of $\lambda(T) > 0$. Even in this case, though, $\lambda$ is still a constant by (20.25). And (20.24) can still (in principle) yield a constant, but different, solution value $q_2^*$. It follows that (20.26) remains valid. But this time, with $\lambda(T) > 0$, the transversality condition (20.12) dictates that $s(T) = 0$, or in view of (20.26),

$$s_0 - q_2^* T = 0$$

Thus we can write the revised (constant) optimal rate of extraction as

$$q_2^* = \frac{s_0}{T}$$

This new solution value should represent a lower extraction rate that would not violate the $s(T) \geq 0$ boundary condition.
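The two-step procedure just described can be written out as a small routine. This is only a sketch under stated assumptions: the trial rate `q_trial` stands for whatever constant value the $\lambda(T) = 0$ attempt delivers once functional forms for $U$ and $c$ are chosen, so here it is simply supplied as a hypothetical input. The function names are ours.

```python
def stock_path(s0, q, t):
    """Remaining stock s(t) = s0 - q*t under a constant extraction rate, as in (20.26)."""
    return s0 - q * t


def extraction_rate(s0, T, q_trial):
    """Keep the trial rate if the stock lasts until T; otherwise use s0 / T."""
    if stock_path(s0, q_trial, T) >= 0:   # s(T) >= 0 holds, so the trial solution stands
        return q_trial
    return s0 / T                         # lambda(T) > 0 forces s(T) = 0
```

For instance, with $s_0 = 100$ and $T = 10$, a trial rate of 5 is acceptable because it leaves 50 units at the terminal date, whereas a trial rate of 20 would exhaust the stock too early and is replaced by $s_0/T = 10$.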





EXERCISE 20.4 


1. Maximize  $\int_0^T (K - aK^2 - I^2)\,dt$  $(a > 0)$
   subject to  $K' = I - \delta K$  $(\delta > 0)$
   and  $K(0) = K_0$,  $K(T)$ free

2. Solve the following exhaustible resource problem for the optimal extraction path:

   Maximize  $\int_0^T \ln(q)\,e^{-\delta t}\,dt$
   subject to  $s' = -q$
   and  $s(0) = s_0$,  $s(T) \geq 0$


20.5 Infinite Time Horizon 





In this section we introduce the problem of dynamic optimization over an infinite planning 
period. Infinite time horizon models tend to introduce complexities with respect to trans- 
versality conditions and optimal time paths that differ from those developed earlier. Rather 
than address these issues here, we shall illustrate the methodology of such models with a 
version of the neoclassical optimal growth model. 


Neoclassical Optimal Growth Model 
The standard neoclassical production function expresses output $Y$ as a function of two inputs: labor $L$ and capital $K$. Its general form is

$$Y = Y(K, L)$$

where $Y(K, L)$ is a linearly homogeneous function with the properties

$$Y_L > 0 \qquad Y_K > 0 \qquad Y_{LL} < 0 \qquad Y_{KK} < 0$$

Rewriting the production function in per capita terms yields

$$y = \phi(k) \qquad \text{with } \phi'(k) > 0 \text{ and } \phi''(k) < 0$$

where $y = Y/L$ and $k = K/L$. Total output $Y$ is allocated to consumption $C$ or gross investment $I$. Let $\delta$ be the rate of depreciation of the capital stock $K$. Then net investment, or changes to the capital stock, can be written as

$$K' = I - \delta K = Y - C - \delta K$$

Denoting per capita consumption as $c \equiv C/L$, we can write

$$\frac{1}{L}K' = y - c - \delta k \tag{20.27}$$


The right-hand side of (20.27) is in per capita terms, but the left-hand side is not. To unify, we note that

$$K' \equiv \frac{dK}{dt} = \frac{d}{dt}(kL) = k\,\frac{dL}{dt} + L\,\frac{dk}{dt} \tag{20.28}$$

If the population growth rate is†

$$\frac{dL/dt}{L} = n \qquad \text{so that} \qquad \frac{dL}{dt} = nL$$

then (20.28) becomes

$$K' = knL + Lk' \qquad \text{or} \qquad \frac{1}{L}K' = kn + k'$$

Substituting this into (20.27) transforms the latter into an equation entirely in per capita terms:

$$k' = y - c - (n + \delta)k = \phi(k) - c - (n + \delta)k \tag{20.27'}$$
Let $U(c)$ be the social welfare function (expressed in per capita terms), where

$$U'(c) > 0 \qquad \text{and} \qquad U''(c) < 0$$

and, to eliminate corner solutions, we also assume

$$U'(c) \to \infty \text{ as } c \to 0 \qquad \text{and} \qquad U'(c) \to 0 \text{ as } c \to \infty$$

If $\rho$ denotes the social discount rate and the initial population is normalized to one, the objective function can be expressed as

$$V = \int_0^\infty U(c)\,e^{-\rho t} L_0 e^{nt}\,dt = \int_0^\infty U(c)\,e^{-(\rho - n)t}\,dt = \int_0^\infty U(c)\,e^{-rt}\,dt \qquad \text{where } r = \rho - n$$

In this version of the neoclassical optimal growth model, utility is weighted by a population that grows continuously at a rate of $n$. However, if $r = \rho - n > 0$, then the model is mathematically no different from one without population weights but with a positive discount rate $r$.

The optimal growth problem can now be stated as

$$\begin{aligned}
\text{Maximize} \quad & \int_0^\infty U(c)\,e^{-rt}\,dt \\
\text{subject to} \quad & k' = \phi(k) - c - (n + \delta)k \\
& k(0) = k_0 \\
\text{and} \quad & 0 \leq c(t) \leq \phi(k)
\end{aligned} \tag{20.29}$$

where $k$ is the state variable and $c$ is the control variable.

† In this model we assume labor force and population to be one and the same.


The Hamiltonian for the problem is

$$H = U(c)\,e^{-rt} + \lambda\,[\phi(k) - c - (n + \delta)k]$$

Since $H$ is concave in $c$, the maximum of $H$ corresponds to an interior solution in the control region $[0 < c < \phi(k)]$, and therefore we can find the maximum of $H$ from

$$\frac{\partial H}{\partial c} = U'(c)\,e^{-rt} - \lambda = 0 \qquad \text{or} \qquad U'(c) = \lambda e^{rt} \tag{20.30}$$

The economic interpretation of (20.30) is that, along the optimal path, the marginal utility of per capita consumption should equal the shadow price of capital ($\lambda$) weighted by $e^{rt}$. Checking second-order conditions, we find

$$\frac{\partial^2 H}{\partial c^2} = U''(c)\,e^{-rt} < 0$$

Therefore, the Hamiltonian is maximized.
From the maximum principle, we have two equations of motion

$$k' = \frac{\partial H}{\partial \lambda} = \phi(k) - c - (n + \delta)k$$

$$\text{and} \qquad \lambda' = -\frac{\partial H}{\partial k} = -\lambda\,[\phi'(k) - (n + \delta)]$$

The two equations of motion combined with $U'(c) = \lambda e^{rt}$ should in principle define a solution for $c$, $k$, and $\lambda$. However, at this level of generality we are unable to do more than undertake qualitative analysis of the model. Anything more would require specific forms of both the utility and production functions.


The Current-Value Hamiltonian 

Since the preceding model is an example of an autonomous problem ($t$ is not a separate argument in the utility function or state equation but appears only in the discount factor), we may use the current-value Hamiltonian written as

$$H_c = H e^{rt} = U(c) + \mu\,[\phi(k) - c - (n + \delta)k] \qquad [\text{see } (20.17)]$$

where $\mu = \lambda e^{rt}$.
The maximum principle calls for

$$\frac{\partial H_c}{\partial c} = U'(c) - \mu = 0 \qquad \text{or} \qquad \mu = U'(c) \tag{20.31}$$

$$k' = \frac{\partial H_c}{\partial \mu} = \phi(k) - c - (n + \delta)k \tag{20.31'}$$

$$\mu' = -\frac{\partial H_c}{\partial k} + r\mu = -\mu\,[\phi'(k) - (n + \delta)] + r\mu = -\mu\,[\phi'(k) - (n + \delta + r)] \tag{20.31''}$$

Equations (20.31') and (20.31'') constitute an autonomous differential equation system. This makes possible a qualitative analysis by phase diagram.


FIGURE 20.4  [Phase diagram in kc space: the vertical line c' = 0, where φ'(k) = n + δ + r, and the curve k' = 0, where c = φ(k) − (n + δ)k]


Constructing a Phase Diagram 

The variables in the differential equations (20.31') and (20.31'') are $k$ and $\mu$. Since (20.31) involves a function of $c$, namely $U'(c)$, rather than the plain $c$ itself, it would be simpler to construct a phase diagram in the $kc$ space rather than the $k\mu$ space. To do this, we shall try to eliminate $\mu$. Since $\mu = U'(c)$, by (20.31), differentiation with respect to $t$ gives us

$$\mu' = U''(c)\,c'$$

Substituting these expressions for $\mu$ and $\mu'$ into (20.31'') yields

$$c' = -\frac{U'(c)}{U''(c)}\,[\phi'(k) - (n + \delta + r)]$$

which is a differential equation in $c$. We now have the autonomous differential equation system

$$k' = \phi(k) - c - (n + \delta)k \tag{20.31'}$$

$$\text{and} \qquad c' = -\frac{U'(c)}{U''(c)}\,[\phi'(k) - (n + \delta + r)] \tag{20.32}$$

To construct the phase diagram in the $kc$ space, we first draw the $k' = 0$ and $c' = 0$ curves, which are defined by

$$c = \phi(k) - (n + \delta)k \qquad (k' = 0) \tag{20.33}$$

$$\text{and} \qquad \phi'(k) = n + \delta + r \qquad (c' = 0) \tag{20.34}$$


These two curves are illustrated in Fig. 20.4. The equation for the $k' = 0$ curve, (20.33), has the same structure as the fundamental equation of the Solow growth model, (15.30). Thus the $k' = 0$ curve has the same general shape as the one in Fig. 15.5b. The $c' = 0$ curve, on the other hand, plots as a vertical line because, given the model specifications $\phi'(k) > 0$ and $\phi''(k) < 0$, $\phi(k)$ is associated with an upward-sloping concave curve, with a different slope at every point on the curve, so that only a unique value of $k$ can satisfy (20.34). The intersection of the two curves at point $E$ determines the intertemporal equilibrium values of $k$ and $c$, because at point $E$, neither $k$ nor $c$ will change in value over time, resulting in a steady state. We could label these values as $\bar{k}$ and $\bar{c}$ for intertemporal equilibrium values, but we shall label them as $k^*$ and $c^*$ instead, because they also represent the equilibrium values for optimal growth.
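Once a production function is specified, the steady state at point $E$ can be computed directly from (20.33) and (20.34). The sketch below assumes a Cobb-Douglas technology in per capita form, $\phi(k) = k^{\alpha}$ with $0 < \alpha < 1$. That functional form, the parameter name `alpha`, and the function name are our own illustrative assumptions, since the model in the text leaves $\phi$ general. With this choice, $\phi'(k) = \alpha k^{\alpha - 1}$, so (20.34) can be solved explicitly for $k^*$.

```python
def steady_state(alpha, n, delta, r):
    """Steady state (k*, c*) for the assumed technology phi(k) = k**alpha."""
    # c' = 0 line, (20.34): alpha * k**(alpha - 1) = n + delta + r
    k_star = (alpha / (n + delta + r)) ** (1 / (1 - alpha))
    # k' = 0 curve, (20.33): c = phi(k) - (n + delta) * k
    c_star = k_star**alpha - (n + delta) * k_star
    return k_star, c_star
```

For example, `steady_state(0.3, 0.01, 0.05, 0.03)` returns a pair with $k^* > 0$ and $c^* > 0$. Raising $r$ shifts the vertical $c' = 0$ line to the left and lowers $k^*$, since a higher marginal product of capital is then required in the steady state.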


Analyzing the Phase Diagram 
The intersection point $E$ in Fig. 20.4 gives us a unique steady state. But what happens if we are initially at some point other than $E$? Returning to our system of first-order differential equations (20.31') and (20.32), we can deduce that

$$\frac{\partial k'}{\partial c} = -1 < 0 \qquad \text{and} \qquad \frac{\partial c'}{\partial k} = -\frac{U'(c)}{U''(c)}\,\phi''(k) < 0$$

Since $\partial k'/\partial c < 0$, all the points below the $k' = 0$ curve are characterized by $k' > 0$ and all the points above the curve by $k' < 0$. Similarly, since $\partial c'/\partial k < 0$, all the points to the left of the $c' = 0$ line are characterized by $c' > 0$ and all the points to the right of the line by $c' < 0$. Thus the $k' = 0$ curve and the $c' = 0$ line divide the phase space into four regions, each with its own distinct pairing of signs of $c'$ and $k'$. These are reflected in Fig. 20.5 by the right-angled directional arrows in each region.

The streamlines that follow the directional arrows in each region tell us that the steady state at point $E$ is a saddle point. If we have an initial point that lies on one of the two stable branches of the saddle point, the dynamics of the system will lead us to point $E$. But any initial point that does not lie on a stable branch will make us either skirt around point $E$, never reaching it, or move steadily away from it. If we follow the streamlines of the latter instances, we will eventually (as $t \to \infty$) end up either with $k = 0$ (exhaustion of capital) or $c = 0$ (per capita consumption dwindling to zero), both of which are economically unacceptable. Thus, the only viable alternative is to choose a $(k, c)$ pair so as to locate our economy on a stable branch (a “yellow brick road,” so to speak) that will take us to the steady state at $E$. We have not explicitly talked about the transversality condition, but if we had, it would have guided us to the steady state at $E$, where the per capita consumption can be maintained at a constant level ever after.
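The saddle-point property can also be checked algebraically by linearizing the system around $E$. The sketch below relies on two illustrative assumptions that are not part of the general model: logarithmic utility, $U(c) = \ln c$, so that $-U'/U'' = c$ and (20.32) becomes $c' = c[\phi'(k) - (n + \delta + r)]$, and the technology $\phi(k) = k^{\alpha}$. At the steady state the Jacobian of $(k', c')$ then has first row $(r, -1)$ and second row $(c^*\phi''(k^*), 0)$, so its trace is $r$ and its determinant is $c^*\phi''(k^*) < 0$. The function name is hypothetical.

```python
import math


def saddle_roots(alpha, n, delta, r):
    """Characteristic roots of the linearized (k, c) system at the steady state."""
    k = (alpha / (n + delta + r)) ** (1 / (1 - alpha))   # k* from (20.34)
    c = k**alpha - (n + delta) * k                       # c* from (20.33)
    phi2 = alpha * (alpha - 1) * k ** (alpha - 2)        # phi''(k*), negative
    trace = r                                            # phi'(k*) - (n + delta) = r
    det = c * phi2                                       # negative when c* > 0
    disc = math.sqrt(trace**2 - 4 * det)
    return (trace - disc) / 2, (trace + disc) / 2
```

A negative determinant guarantees real roots of opposite sign, which is the defining feature of a saddle point: the negative root is associated with the stable branches, and the positive root with the paths that move away from $E$.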





FIGURE 20.5  [Phase diagram in kc space with directional arrows: the vertical line c' = 0, where φ'(k) = n + δ + r, and the curve k' = 0, where c = φ(k) − (n + δ)k]


20.6 Limitations of Dynamic Analysis 





The static analysis presented in Part 2 of this volume dealt only with the question of what the equilibrium position will be under certain given conditions of a model. The major query was: What values of the variables, if attained, will tend to perpetuate themselves? But the attainability of the equilibrium position was taken for granted. When we proceeded to the realm of comparative statics, in Part 3, the central question shifted to a more interesting problem: How will the equilibrium position shift in response to a certain change in a parameter? But the attainability aspect was again brushed aside. It was not until we reached the dynamic analysis in Part 5 that we looked the question of attainability squarely in the eye. Here we specifically ask: If initially we are away from an equilibrium position (say, because of a recent disequilibrating parameter change), will the various forces in the model tend to steer us toward the new equilibrium position? Furthermore, in a dynamic analysis, we also learn the particular character of the path (whether steady, fluctuating, or oscillatory) the variable will follow on its way to the equilibrium (if at all). The significance of dynamic analysis should therefore be self-evident.

However, in concluding its discussion, we should also take cognizance of the limitations of dynamic analysis. For one thing, to make the analysis manageable, dynamic models are often formulated in terms of linear equations. While simplicity may thereby be gained, the assumption of linearity will in many cases entail a considerable sacrifice of realism. Since a time path which is germane to a linear model may not always approximate that of a nonlinear counterpart, as we have seen in the price-ceiling example in Sec. 17.6, care must be exercised in the interpretation and application of the results of linear dynamic models. In this connection, however, the qualitative-graphic approach may perform an extremely valuable service, because under quite general conditions it can enable us to incorporate nonlinearity into a model without adding undue complexity to the analysis.

Another shortcoming usually found in dynamic economic models is the use of constant coefficients in differential or difference equations. Inasmuch as the primary role of the coefficients is to specify the parameters of the model, the constancy of coefficients (again assumed for the sake of mathematical manageability) essentially serves to “freeze” the economic environment of the problem under investigation. In other words, it means that the endogenous adjustment of the model is being studied in a sort of economic vacuum, such that no exogenous factors are allowed to intrude. In certain cases, of course, this problem may not be too serious, because many economic parameters do tend to stay relatively constant over long periods of time. And in some other cases, we may be able to undertake a comparative-dynamic type of analysis, to see how the time path of a variable will be affected by a change in certain parameters. Nevertheless, when we are interpreting a time path that extends into the distant future, we should always be careful not to be overconfident about the validity of the path in its more remote stretches, if simplifying assumptions of constancy have been made.

You realize, of course, that to point out its limitations as we have done here is by no means intended to disparage dynamic analysis as such. Indeed, it will be recalled that each type of analysis hitherto presented has been shown to have its own brand of limitations. As long as it is duly interpreted and properly applied, therefore, dynamic analysis, like any other type of analysis, can play an important part in the study of economic phenomena. In particular, the techniques of dynamic analysis have enabled us to extend the study of optimization into the realm of dynamic optimization in this chapter, in which the solution we seek is no longer a static optimum state, but an entire optimal time path.











The Greek Alphabet 


Α    α            alpha
Β    β            beta
Γ    γ            gamma
Δ    δ            delta
Ε    ε            epsilon
Ζ    ζ            zeta
Η    η            eta
Θ    θ            theta
Ι    ι            iota
Κ    κ            kappa
Λ    λ            lambda
Μ    μ            mu
Ν    ν            nu
Ξ    ξ            xi
Ο    ο            omicron
Π    π            pi
Ρ    ρ            rho
Σ    σ            sigma
Τ    τ            tau
Υ    υ            upsilon
Φ    φ (or ϕ)     phi
Χ    χ            chi
Ψ    ψ            psi
Ω    ω            omega


Mathematical Symbols 


1. Sets

$a \in S$                                   a is an element of (belongs to) set S
$b \notin S$                                b is not an element of set S
$S \subset T$                               set S is a subset of (is contained in) set T
$T \supset S$                               set T includes set S
$A \cup B$                                  the union of set A and set B
$A \cap B$                                  the intersection of set A and set B
$\tilde{S}$                                 the complement of set S
$\{\ \}$ or $\emptyset$                     the null set (empty set)
$\{a, b, c\}$                               the set with elements a, b, and c
$\{x \mid x \text{ has property } P\}$      the set of all objects with property P
$\min\{a, b, c\}$                           the smallest element of the specified set
$R$                                         the set of all real numbers
$R^2$                                       the two-dimensional real space
$R^n$                                       the n-dimensional real space
$(x, y)$                                    ordered pair
$(x, y, z)$                                 ordered triple
$(a, b)$                                    open interval from a to b
$[a, b]$                                    closed interval from a to b

2. Matrices and Determinants

$A'$ or $A^T$                               the transpose of matrix A
$A^{-1}$                                    the inverse of matrix A
$|A|$                                       the determinant of matrix A
$|J|$                                       Jacobian determinant
$|H|$                                       Hessian determinant
$|\bar{H}|$                                 bordered Hessian determinant
$r(A)$                                      the rank of matrix A
$\operatorname{tr} A$                       the trace of A
$0$                                         null matrix (zero matrix)
$u \cdot v$                                 the inner product (dot product) of vectors u and v
$u'v$                                       the scalar product of two vectors

3. Calculus

Given $y = f(x)$, a function of a single variable x:

$\lim_{x \to \infty} f(x)$                  the limit of f(x) as x approaches infinity
$dy$                                        the first differential of y
$d^2y$                                      the second differential of y
$\frac{dy}{dx}$ or $f'(x)$                  the first derivative of the function y = f(x)
$\left.\frac{dy}{dx}\right|_{x=x_0}$ or $f'(x_0)$    the first derivative evaluated at $x = x_0$
$\frac{d^2y}{dx^2}$ or $f''(x)$             the second derivative of y = f(x)
$\frac{d^ny}{dx^n}$ or $f^{(n)}(x)$         the nth derivative of y = f(x)
$\int f(x)\,dx$                             indefinite integral of f(x)
$\int_a^b f(x)\,dx$                         definite integral of f(x) from x = a to x = b

Given the function $y = f(x_1, x_2, \ldots, x_n)$:

$\frac{\partial y}{\partial x_i}$ or $f_i$  the partial derivative of f with respect to $x_i$
$\nabla f \equiv \operatorname{grad} f$     the gradient of f
$\frac{dy}{dx_i}$                           the total derivative of f with respect to $x_i$
$\frac{\S y}{\S x_i}$                       the partial total derivative of f with respect to $x_i$

4. Differential and Difference Equations

$\dot{y} \equiv \frac{dy}{dt}$              the time derivative of y
$\Delta y_t$                                the first difference of $y_t$
$\Delta^2 y_t$                              the second difference of $y_t$
$y_p$                                       particular integral
$y_c$                                       complementary function

5. Others

$\sum_{i=1}^{n} x_i$                        the sum of $x_i$ as i ranges from 1 to n
$p \Rightarrow q$                           p only if q (p implies q)
$p \Leftarrow q$                            p if q (p is implied by q)
$p \Leftrightarrow q$                       p if and only if q
iff                                         if and only if
$|m|$                                       the absolute value of the number m
$n!$                                        n factorial $\equiv n(n-1)(n-2)\cdots(3)(2)(1)$
$\log_b x$                                  the logarithm of x to base b
$\log_e x$ or $\ln x$                       the natural logarithm of x (to base e)
$e$                                         the base of natural logarithms and natural exponential functions
$\sin\theta$                                sine function of θ
$\cos\theta$                                cosine function of θ
$R_n$                                       the remainder term when the Taylor series involves an nth-degree polynomial


A Short Reading List 


Abadie, J. (ed.): Nonlinear Programming, North-Holland Publishing Company, Amsterdam,
1967. (A collection of papers on certain theoretical and computational aspects of non-
linear programming; Chapter 2, by Abadie, deals with the Kuhn-Tucker theorem in
relation to the constraint qualification.)

Allen, R. G. D.: Mathematical Analysis for Economists, Macmillan & Co., Ltd., London,
1938. (A clear exposition of differential and integral calculus; determinants are dis-
cussed, but not matrices; no set theory, and no mathematical programming.)

______: Mathematical Economics, 2d ed., St. Martin’s Press, Inc., New York, 1959. (Dis-
cusses a legion of mathematical economic models; explains linear differential and
difference equations and matrix algebra.)

Almon, C.: Matrix Methods in Economics, Addison-Wesley Publishing Company, Inc.,
Reading, Mass., 1967. (Matrix methods are discussed in relation to linear-equation
systems, input-output models, linear programming, and nonlinear programming.
Characteristic roots and characteristic vectors are also covered.)

Baldani, J., J. Bradfield, and R. Turner: Mathematical Economics, The Dryden Press,
Orlando, 1996.

Baumol, W. J.: Economic Dynamics: An Introduction, 3d ed., The Macmillan Company,
New York, 1970. (Part IV gives a lucid explanation of simple difference equations;
Part V treats simultaneous difference equations; differential equations are only briefly
discussed.)

Braun, M.: Differential Equations and Their Applications: An Introduction to Applied
Mathematics, 4th ed., Springer-Verlag, Inc., New York, 1993. (Contains interesting
applications of differential equations, such as the detection of art forgeries, the spread
of epidemics, the arms race, and the disposal of nuclear waste.)

Burmeister, E., and A. R. Dobell: Mathematical Theories of Economic Growth, The
Macmillan Company, New York, 1970. (A thorough exposition of growth models of
varying degrees of complexity.)

Chiang, Alpha C.: Elements of Dynamic Optimization, McGraw-Hill Book Company,
1992, now published by Waveland Press, Inc., Prospect Heights, Ill.

Clark, Colin W.: Mathematical Bioeconomics: The Optimal Management of Renewable
Resources, 2nd ed., John Wiley & Sons, Inc., Toronto, 1990. (A thorough explanation
of optimal control theory and its use in both renewable and nonrenewable resources.)



Coddington, E. A., and N. Levinson: Theory of Ordinary Differential Equations, 
MeGraw-Hill Book Company, New York, 1955. (A basic mathematical text on differ- 
ential equations.) 

Courant, R.: Differential and Integral Calculus (trans. E. J. McShane), Interscience 

Publishers, Inc., New York, vol. I, 2d cd., 1937, vol. II, 1936. (A classic treatise on 

calculus.) 

_, and F. John: /atroduction to Calculus and Analysis, Interscience Publishers, Inc., 
New York, vol. I, 1965, vol. II, 1974. (An updated version of the preceding title.) 
Dorfman, R., P A. Samuclson, and R. M. Solow: Linear Programming and Economic 

Analysis, McGraw-Hill Book Company, New York, 1958, (A detailed treatment of 
linear programming, game theory, and input-output analysis.) 


Franklin, J; Methods of Mathematical Economics: Linear and Nonlinear Programming, 
Fixed-Point Theorems, Springer-Verlag, Inc., New York, 1980. (A delightful presenta- 
tion of mathematical programming.) 

Frisch, R.; Maxima and Minima: Theory and Economic Applications (in collaboration with 
A. Nataf), Rand McNally & Company, Chicago, IIL, 1966. {A thorough treatment of 
extremum problems, done primarily in the classical tradition.) 

Goldberg, S.: Introduction to Difference Equations, John Wiley & Sons, Inc., New York, 
1958. (With economic applications.) 

Hadley, G.: Linear Algebra, Addison-Wesley Publishing Company, Inc., Reading, Mass,, 

1961. (Covers matrices, determinants, convex sets, etc.) 

: Linear Programming, Addison-Wesley Publishing Company, Inc,, Reading, 

Mass., 1962. (A clearly written, mathematically oriented exposition.) 

_..__: Nonlinear and Dynamic Programming, Addison-Wesley Publishing Company, Inc., 
Reading, Mass., 1964. (Covers nonlinear programming, stochastic programming, inte- 
ger programming, and dynamic programming; computational aspects are emphasized.) 

Halmos, P. R.: Naive Set Theory, D. Van Nostrand Company, Inc., Princeton, N.J., 1960. 
(An informal and hence readabic introduction to the basics of set theory.) 

Hands, D. Wade: Introductory Mathematical Economics, 2nd ed., Oxford University Press, 
New York, 2004. 

Henderson, J. M., and R. E. Quandt: Micraeconomic Theory: A Mathematical Approach, 3d 
ed., McGraw-Hill Book Company, New York, 1980. (A comprehensive mathematical 
treatment of microeconomic topics.) 

Hoy, M., J, Livernais, C. McKenna, R. Rees, and T. Stengos; Mathematics for Economics, 
2nd ¢d., The MIT Press, Cambridge, Mass. 2001. 

Intriligator, M. D.: Mathematical Optimization and Economic Theory, Prentice Hall, Inc., 
Englewood Cliffs, N.J., 1971. (A thorough discussion of optimization methods, in- 
cluding the classical techniques, linear and nonlinear programming, and dynamic 
optimization; also applications to the theories of the consumer and the firm, general
equilibrium and welfare economics, and theories of growth.) 

Kemeny, J. G., J. L. Snell, and G. L. Thompson: Introduction to Finite Mathematics, 3d ed.,
Prentice Hall, Inc., Englewood Cliffs, N.J., 1974. (Covers such topics as sets, matrices, 
probability, and linear programming.) 

Klein, Michael W.: Mathematical Methods for Economics, 2nd ed., Addison-Wesley 
Publishing Company, Inc., Reading, Mass., 2002.

Koo, D.: Elements of Optimization: With Applications in Economics and Business,
Springer-Verlag, Inc., New York, 1977. (Clear discussion of classical optimization 
methods, mathematical programming as well as optimal control theory.) 
















Koopmans, T. C. (ed.): Activity Analysis of Production and Allocation, John Wiley & Sons,
Inc., New York, 1951, reprinted by Yale University Press, 1972. (Contains a number of
important papers on linear programming and activity analysis.)

Koopmans, T. C.: Three Essays on the State of Economic Science, McGraw-Hill Book Company,
New York, 1957. (The first essay contains a good exposition of convex sets; the third
essay discusses the interaction of tools and problems in economics.)

Lambert, Peter J.: Advanced Mathematics for Economists: Static and Dynamic Optimization,
Blackwell Publishers, New York, 1985.

Leontief, W. W.: The Structure of American Economy, 1919-1939, 2d ed., Oxford University
Press, Fair Lawn, N.J., 1951. (The pioneering work in input-output analysis.)

Samuelson, P. A.: Foundations of Economic Analysis, Harvard University Press, Cambridge,
Mass., 1947. (A classic in mathematical economics, but very difficult to read.)

Silberberg, Eugene, and Wing Suen: The Structure of Economics: A Mathematical 
Analysis, 3rd ed., McGraw-Hill Book Company, New York, 2001. (Primarily a micro-
economic focus, this book has a strong discussion of the envelope theorem and a wide 
variety of applications.) 

Sydsæter, Knut, and Peter Hammond: Essential Mathematics for Economic Analysis,
Prentice Hall, Inc., London, 2002. 

Takayama, A.: Mathematical Economics, 2nd ed., The Dryden Press, Hinsdale, Ill., 1985.
(Gives a comprehensive treatment of economic theory in mathematical terms, with
concentration on two specific topics: competitive equilibrium and economic growth.) 

Thomas, G. B., and R. L. Finney: Calculus and Analytic Geometry, 9th ed., Addison- 
Wesley Publishing Company, Inc., Reading, Mass., 1996. (A clearly written introduc- 
tion to calculus.) 







Answers to Selected 
Exercises 


Exercise 2.3 

1. (a) {x | x > 34}

3. (a) {2, 4, 6, 7}  (c) {2, 6}  (e) {2}

8. There are 16 subsets.

9. Hint: Distinguish between the two symbols ∉ and ⊄.


Exercise 2.4 

1. (a) {(3, a), (3, b), (6, a), (6, b), (9, a), (9, b)}
3. No.

5. Range = {y | 8 ≤ y ≤ 32}


Exercise 2.5 

2. (a) and (b) differ in the sign of the slope measure; (a) and (c) differ in the vertical
intercept.

4. When negative values are permissible, quadrant III has to be used too.

5. (a) $x^{19}$

6. (a) $x^6$


Exercise 3.2 
1. $P^* = 2\frac{3}{11}$, and $Q^* = 14\frac{2}{11}$


3. Note: In 2(a), c = 10 (not 6).
5. Hint: b + d = 0 implies d = −b.


Exercise 3.3 

1. (a) $x_1^* = 5$, and $x_2^* = 3$

3. (a) $(x - 6)(x + 1)(x - 3) = 0$, or $x^3 - 8x^2 + 9x + 18 = 0$
5. (a) $-1$, $2$, and $3$  (c) $-1$, $\frac{1}{2}$, and $-\frac{1}{4}$




Exercise 3.4 
RPra3h 0 pra3h 0 grand 





Exercise 3.5 


1. (b) $Y^* = (a - bd + I_0 + G_0)/[1 - b(1 - t)]$
       $T^* = [d(1 - b) + t(a + I_0 + G_0)]/[1 - b(1 - t)]$
       $C^* = [a - bd + b(1 - t)(I_0 + G_0)]/[1 - b(1 - t)]$

3. Hint: After substituting the last two equations into the first, consider the resulting
equation as a quadratic equation in the variable $w \equiv Y^{1/2}$. Only one root is acceptable,
$w^* = 11$, giving $Y^* = 121$ and $C^* = 91$. The other root leads to a negative $C^*$.
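
The hint for answer 3 can be checked numerically. The sketch below assumes the underlying model is Y = C + I0 + G0 with C = 25 + 6Y^(1/2), I0 = 16, and G0 = 14; these specifics are not restated in this answer key and are an assumption, chosen because they reproduce the stated values. The variable names are illustrative only.

```python
import math

# Assumed model (not restated in the answer key):
#   Y = C + I0 + G0,  C = a + b*sqrt(Y)
I0, G0 = 16.0, 14.0
a, b = 25.0, 6.0

# Substituting C into the first equation and writing w = sqrt(Y)
# gives the quadratic  w**2 - b*w - (a + I0 + G0) = 0.
const = a + I0 + G0
disc = b * b + 4.0 * const
roots = [(b + math.sqrt(disc)) / 2.0, (b - math.sqrt(disc)) / 2.0]

for w in roots:
    Y = w * w          # income implied by this root
    C = a + b * w      # consumption implied by this root
    print(f"w = {w:g}, Y = {Y:g}, C = {C:g}")
```

Under these assumptions only the positive root gives a nonnegative consumption level, which is why the hint discards the other one.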

Exercise 4.1 


1. The elements in the (column) vector of constants are: 0, a, —c. 


Exercise 4.2
1. (a) $\begin{bmatrix} 7 & 3 \\ 9 & 7 \end{bmatrix}$  (c) $\begin{bmatrix} 21 & -3 \\ 18 & 27 \end{bmatrix}$


1 
3. In this special case, 4B happens to be equal to B.A = : 
0 


49 3 3x + Sy 
40 [ 4 :] 2) [anh | 


e-o 


roo 
uN 


(2x2) ax 
6 (a) 2 Fxg txatxs (ce) AC xe +43 +4) 
4 
7 Vatuitd (a) Hint: x =| tory £0 
f=? 


Exercise 4.3 


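1. (a) $uv' = \begin{bmatrix} 15 & 5 & -5 \\ 3 & 1 & -1 \\ 9 & 3 & -3 \end{bmatrix}$  (c) $xx' = \begin{bmatrix} x_1^2 & x_1x_2 & x_1x_3 \\ x_2x_1 & x_2^2 & x_2x_3 \\ x_3x_1 & x_3x_2 & x_3^2 \end{bmatrix}$

(e) $u'v = 13$  (g) $u'u = 35$
3. (a) $\sum_{i=1}^{n} P_iQ_i$  (b) $P \cdot Q$ or $P'Q$ or $Q'P$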


5. (a) w =[¢] w@a-v=[ 5] 


L@d=VJi 
9. (c) $d(v, 0) = (v \cdot v)^{1/2}$


Exercise 4.4 


L@ [ii i 




2. No; it should be A − B = −B + A.
4. (a) $k(A + B) = k[a_{ij} + b_{ij}] = [ka_{ij} + kb_{ij}] = [ka_{ij}] + [kb_{ij}] = k[a_{ij}] + k[b_{ij}] = kA + kB$  (Can you justify each step?)


Exercise 4.5 


L@ an=[4 3 i] © n= |" 


3. (a) 5 × 3  (c) 2 × 1
4. Hint: Multiply the given diagonal matrix by itself, and examine the resulting product
matrix for conditions for idempotency. 


Exercise 4.6 

,_{0 -l »_{| 3 0 
natal 3 and B’ = 3 | 
3. Hint: Define D = AB, and apply (4.11). 
5. Hint: Define D = AB, and apply (4.14).


Exercise 5.1 

1. (a) (5.2)  (c) (5.3)  (e) (5.3)

3. (a) Yes.  (d) No.

5. (a) r(A) = 3; A is nonsingular.  (b) r(B) = 2; B is singular.


Exercise 5.2 

1. (a) $-6$  (c) $0$  (e) $3abc - a^3 - b^3 - c^3$
3. $|M_b| = \begin{vmatrix} d & f \\ g & i \end{vmatrix}$  $|C_b| = -\begin{vmatrix} d & f \\ g & i \end{vmatrix}$

4. (a) Hint: Expand by the third column.
5. 20 (not −20)








Exercise 5.3 

3. (a) Property IV.  (b) Property III (applied to both rows).
4. (a) Singular.  (c) Singular.

5. (a) Rank < 3  (c) Rank < 3

7. A is nonsingular because |A| = 1 − b ≠ 0.


Exercise 5.4 
1. $\sum_{i=1}^{4} a_{i3}|C_{i2}|$  $\sum_{j=1}^{4} a_{2j}|C_{4j}|$


3. (a) Interchange the two diagonal elements of A; multiply the two off-diagonal
elements of A by −1.

(b) Divide by |A|.




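4. (a) $E^{-1} = \frac{1}{20}\begin{bmatrix} 3 & 2 & -3 \\ -7 & 2 & 7 \\ -6 & -4 & 26 \end{bmatrix}$  (c) $G^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$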


Exercise 5.5 
1. (a) $x_1^* = 4$, and $x_2^* = 3$  (c) $x_1^* = 2$, and $x_2^* = 1$


a If 12), 74 a_1f 17), 4_f2 
bones ipeeb] ot 84 dee[l 


3. (a) $x_1^* = 2$, $x_2^* = 0$, $x_3^* = 1$  (c) $x^* = 0$, $y^* = 3$, $z^* = 4$
4. Hint: Apply (5.8) and (5.13).


Exercise 5.6 


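1. (a) $A^{-1} = \frac{1}{1 - b + bt}\begin{bmatrix} 1 & 1 & -b \\ b(1 - t) & 1 & -b \\ t & t & 1 - b \end{bmatrix}$

$\begin{bmatrix} Y^* \\ C^* \\ T^* \end{bmatrix} = \frac{1}{1 - b + bt}\begin{bmatrix} I_0 + G_0 + a - bd \\ b(1 - t)(I_0 + G_0) + a - bd \\ t(I_0 + G_0) + at + d(1 - b) \end{bmatrix}$

(b) $|A| = 1 - b + bt$  $|A_1| = I_0 + G_0 - bd + a$
$|A_2| = a - bd + b(1 - t)(I_0 + G_0)$  $|A_3| = d(1 - b) + t(a + I_0 + G_0)$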
Exercise 5.7 
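1. $x_1^* = 69.53$, $x_2^* = 57.03$, and $x_3^* = 42.58$

3. (a) $A = \begin{bmatrix} 0.10 & 0.50 \\ 0.60 & 0 \end{bmatrix}$; the matrix equation is $\begin{bmatrix} 0.90 & -0.50 \\ -0.60 & 1.00 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1{,}000 \\ 2{,}000 \end{bmatrix}$

(c) $x_1^* = 3{,}333\frac{1}{3}$, and $x_2^* = 4{,}000$
4. Element 0.33: 33¢ of Commodity II is needed as input for producing $1 of Commodity I.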
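
Answer 3 of Exercise 5.7 can be reproduced with Cramer's rule. The sketch below is illustrative rather than the book's own procedure: the input-coefficient matrix is taken from part (a), and the final-demand vector (1,000 and 2,000) is an assumption inferred from the stated solution. The helper name `solve_2x2` is hypothetical.

```python
def solve_2x2(m, d):
    """Solve the 2x2 system m x = d by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("coefficient matrix is singular")
    x1 = (d[0] * m[1][1] - m[0][1] * d[1]) / det
    x2 = (m[0][0] * d[1] - d[0] * m[1][0]) / det
    return x1, x2

# Input-coefficient matrix A from part (a); final demand d is assumed.
A = [[0.10, 0.50],
     [0.60, 0.00]]
d = [1000.0, 2000.0]

# Leontief matrix (I - A).
leontief = [[1 - A[0][0], -A[0][1]],
            [-A[1][0], 1 - A[1][1]]]

x1, x2 = solve_2x2(leontief, d)
print(round(x1, 2), round(x2, 2))
```

With these inputs the determinant of (I − A) is 0.6, so the system has a unique solution.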
Exercise 6.2 
1. (a) $\Delta y/\Delta x = 8x + 4\Delta x$  (b) $dy/dx = 8x$  (c) $f'(3) = 24$, $f'(4) = 32$
3. (a) $\Delta y/\Delta x = 5$; a constant function.
Exercise 6.4 
1. Left-side limit = right-side limit = 15; the limit is 15. 
(Qs (by 3 
Exercise 6.5
1. (a) $-3/4 < x$  (c) $x < 1/2$
3. (a) $-7 < x < 5$  (c) $-4 \le x \le 1$
Exercise 6.6 


1. (a) 7  (c) 17
3. (a) $2\frac{1}{2}$  (c) 2




Exercise 6.7 
2. (a) $N^2 - 5N - 2$  (b) Yes.  (c) Yes.
3. (a) $(N + 2)/(N^2 + 2)$  (b) Yes.  (c) Continuous in the domain.
6. Yes; each function is continuous and smooth.
Exercise 7.1 
1. (a) $dy/dx = 12x^{11}$  (c) $dy/dx = 35x^4$  (e) $dw/du = -2u^{-1/2}$
3. (a) $f'(x) = 18$; $f'(1) = f'(2) = 18$
(c) $f'(x) = 10x^{-3}$; $f'(1) = 10$, $f'(2) = 1\frac{1}{4}$
Exercise 7.2
1. $VC = Q^3 - 5Q^2 + 12Q$; $dVC/dQ = 3Q^2 - 10Q + 12$ is the MC function.
3. (a) $3(27x^2 + 6x - 2)$  (c) $12x(x + 1)$  (e) $-x(9x + 14)$
4. (b) $MR = 60 - 6Q$
7. (a) $(x^2 - 3)/x^2$  (c) $30/(x + 5)^2$
8. (a) $a$  (c) $-a/(ax + b)^2$
Exercise 7.3 
1. $-2x[3(5 - x^2)^2 + 2]$
3. (a) $18x(3x^2 - 13)^2$  (c) $5a(ax + b)^4$
5. $x = \frac{1}{7}y - 3$, $dx/dy = \frac{1}{7}$
Exercise 7.4 
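1. (a) $\partial y/\partial x_1 = 6x_1^2 - 22x_1x_2$, and $\partial y/\partial x_2 = -11x_1^2 + 6x_2$
(c) $\partial y/\partial x_1 = 2(x_2 - 2)$, and $\partial y/\partial x_2 = 2x_1 + 3$
3. (a) 12  (c) 10/9
5. (a) $U_1 = 2(x_1 + 2)(x_2 + 3)^3$, and $U_2 = 3(x_1 + 2)^2(x_2 + 3)^2$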
Exercise 7.5 
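1. $\partial Q^*/\partial a = d/(b + d) > 0$  $\partial Q^*/\partial b = -d(a + c)/(b + d)^2 < 0$
$\partial Q^*/\partial c = -b/(b + d) < 0$  $\partial Q^*/\partial d = b(a + c)/(b + d)^2 > 0$
2. $\partial Y^*/\partial I_0 = \partial Y^*/\partial \alpha = 1/(1 - \beta + \beta\delta) > 0$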
Exercise 7.6 
1. (a) $|J| = 0$; the functions are dependent.
(b) $|J| = -20x_2$; the functions are independent.
Exercise 8.1 
1. (a) $dy = -3(x^2 + 1)\,dx$  (c) $dy = [(1 - x^2)/(x^2 + 1)^2]\,dx$
3. (a) $dC/dY = b$, and $C/Y = (a + bY)/Y$
Exercise 8.2 
2. (a) $dz = (6x + y)\,dx + (x - 6y^2)\,dy$




3. (a) $dy = [x_2/(x_1 + x_2)^2]\,dx_1 - [x_1/(x_1 + x_2)^2]\,dx_2$

4. $\varepsilon_{QP} = 2bP^2/(a + bP^2 + R^{1/2})$

6. $\varepsilon_{XP} = -2/(Y_f^{1/2}P^2 + 1)$

Exercise 8.3 

3. (a) $dy = 3[(2x_2 - 1)(x_3 + 5)\,dx_1 + 2x_1(x_3 + 5)\,dx_2 + x_1(2x_2 - 1)\,dx_3]$

4. Hint: Apply the definitions of differential and total differential. 

Exercise 8.4 

1. (a) $dz/dy = x + 10y + 6y^2 = 28y + 9y^2$
(c) $dz/dy = -15x + 3y = 108y - 30$

3. $dQ/dt = [a\alpha A/K + b\beta A/L + A'(t)]K^\alpha L^\beta$

4. (6) §W/Su = uf th §W/u=3A- lw’ f 

Exercise 8.5 

5. (a) Defined; $dy/dx = -(3x^2 - 4xy + 3y^2)/(-2x^2 + 6xy) = -9/8$
(b) Defined; $dy/dx = -(4x + 4y)/(4x - 4y^3) = 2/13$

7. The condition $F_y \ne 0$ is violated at (0, 0).

8. The product of partial derivatives is equal to —1. 


Exercise 8.6 
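1. (c) $(dY^*/dG_0) = 1/(S' + T' - I') > 0$
3. $(\partial P^*/\partial Y_0) = D_{Y_0}/(S_{P^*} - D_{P^*}) > 0$  $(\partial Q^*/\partial Y_0) = D_{Y_0}S_{P^*}/(S_{P^*} - D_{P^*}) > 0$
$(\partial P^*/\partial T_0) = -S_{T_0}/(S_{P^*} - D_{P^*}) > 0$  $(\partial Q^*/\partial T_0) = -S_{T_0}D_{P^*}/(S_{P^*} - D_{P^*}) < 0$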
Exercise 9.2 
1. (a) When x = 2, y = 15 (a relative maximum).
(c) When x = 0, y = 3 (a relative minimum).
2. (a) The critical value x = −1 lies outside the domain; the critical value x = 1 leads to
y = 3 (a relative minimum).
4. (d) The elasticity is one.
Exercise 9.3 
1. (a) $f''(x) = 2a$, $f'''(x) = 0$  (c) $f''(x) = 6(1 - x)^{-3}$, $f'''(x) = 18(1 - x)^{-4}$
3. (b) A straight line.
5. Every point on f(x) is a stationary point, but the only stationary point on g(x)
we know of is at x = 3.
Exercise 9.4 
1. (a) f(2) = 33 is a maximum.
(c) $f(1) = 5\frac{1}{3}$ is a maximum; $f(5) = -5\frac{1}{3}$ is a minimum.
2. Hint: First write an area function A in terms of one variable (either L or W) alone.
3. (d) Q* = 11  (e) Maximum profit = $111\frac{1}{3}$




5. (a) k < 0  (b) h < 0  (c) j > 0
7. (b) Sis maximized at the output level 20.37 (approximately). 


Exercise 9.5 

1. (a) 120  (c) 4  (e) $(n + 2)(n + 1)$
2. (a) $1 + x + x^2 + x^3 + x^4$

3. (b) $-63 - 98x - 62x^2 - 18x^3 - 2x^4 + R_4$


Exercise 9.6 
1. (a) f(0) = 0 is an inflection point.  (c) f(0) = 5 is a relative minimum.
2. (b) f(2) = 0 is a relative minimum.


Exercise 10.1 

1. (a) Yes. (5) Yes. 

3, (a) Se (e} — 120" 

5. (a) The curve with a = −1 is the mirror image of the curve with a = 1 with reference
to the horizontal axis.


Exercise 10.2 
1. (a) 7.388  (b) 1.649


2. (c) $1 + 2x + \frac{1}{2!}(2x)^2 + \frac{1}{3!}(2x)^3 + \cdots$
3. (a) $\$70e^{0.12}$  (b) $\$690e^{0.10}$


Exercise 10.3 

1. (a) 4  (c) 4

2. (a) 7  (c) −3  (e) 6

3. (a) 26  (c) $\ln 3 - \ln B$  (f) 3


Exercise 10.4 
1. The requirement prevents the function from degenerating into a constant function.
3. Hint: Take log to base b.
4. (a) $y = e^{(3\ln 8)t}$ or $y = e^{6.2385t}$  (c) $y = 5e^{(\ln 5)t}$ or $y = 5e^{1.6095t}$
5. (a) $t = (\ln y)/(\ln 7)$ or $t = 0.5139\ln y$
(c) $t = 3\ln(9y)/(\ln 15)$ or $t = 1.1078\ln(9y)$
6. (a) $r = \ln 1.05$  (c) $r = 2\ln 1.03$


Exercise 10.5 

1. (a) $2e^{2t+4}$  (c) $2te^{t^2+1}$  (e) $(2ax + b)e^{ax^2+bx+c}$
3.) 5/f EFL) @ W/O +3) 

5. Hint: Use (10.21), and apply the chain rule.

7. (a) (8 — P/M +27 +47") 




Exercise 10.6 
1. $t^* = 1/r^2$
2. $d^2A/dt^2 = -A(\ln 2)/(4\sqrt{t^3}) < 0$


Exercise 10.7 
1. (a) $2/t$  (c) $\ln b$  (e) $1/t - \ln 3$
3. ry = ry 
Tlegl =a 

IL. rg = egere + €oits 


Exercise 11.2 

1. z* = 3 is a minimum.

3. z* = c, which is a minimum in case (a), a maximum in case (b), and a saddle point in
case (c).

5. (a) Any pair (x, y) other than (2, 3) yields a positive z value.
(b) Yes.  (c) No.  (d) Yes ($d^2z = 0$).


Exercise 11.3 
1. (a) $q = 4u^2 + 4uv + 3v^2$  (c) $q = 5x^2 + 6xy$
3. (a) Positive definite.  (c) Neither.
5. (a) Positive definite.  (c) Negative definite.  (e) Positive definite.
6. (a) $r_1, r_2 = \frac{1}{2}(7 \pm \sqrt{17})$; $u'Du$ is positive definite.

(c) $r_1, r_2 = \frac{1}{2}(5 \pm \sqrt{61})$; $u'Fu$ is indefinite.

woe _ Bea 
1/5 2/5 
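
The characteristic-root test behind answer 6 of Exercise 11.3 can be sketched for a 2 × 2 symmetric matrix. The matrices D and F below are assumptions, chosen because their roots match the values stated in the answer; the exercise itself is not restated in this key. The function names are hypothetical.

```python
import math

def char_roots_2x2(m):
    """Characteristic roots of a symmetric 2x2 matrix from r^2 - tr*r + det = 0."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = tr * tr - 4.0 * det  # nonnegative whenever m is symmetric
    root = math.sqrt(disc)
    return (tr + root) / 2.0, (tr - root) / 2.0

def classify(roots):
    """Sign definiteness of the quadratic form from the signs of its roots."""
    if all(r > 0 for r in roots):
        return "positive definite"
    if all(r < 0 for r in roots):
        return "negative definite"
    if any(r > 0 for r in roots) and any(r < 0 for r in roots):
        return "indefinite"
    return "semidefinite"

D = [[4.0, 2.0], [2.0, 3.0]]  # assumed matrix for part (a)
F = [[5.0, 3.0], [3.0, 0.0]]  # assumed matrix for part (c)

roots_D = char_roots_2x2(D)
roots_F = char_roots_2x2(F)
print(roots_D, classify(roots_D))
print(roots_F, classify(roots_F))
```

Both roots of D are positive, while F has one root of each sign, which is the pattern the answer reports.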

Exercise 11.4 
1. z* = 0 (minimum)
3. z* = −11/40 (minimum)
4. z* = 2 − e (minimum), attained at (x*, y*, w*) = (0, 0, 1)


5. (b) Hint: See (11.16).
6. (a) $r_1 = 2$  $r_2 = 4 + \sqrt{6}$  $r_3 = 4 - \sqrt{6}$


Exercise 11.5 


1. (a) Strictly convex.  (c) Strictly convex.
2. (a) Strictly concave.  (c) Neither.
3. No.


5. (a) Disk.  (b) Yes.
7. (a) Convex combination, with θ = 0.5.  (b) Convex combination, with θ = 0.2.




Exercise 11.6 

1. (a) No.  (b) $Q_1^* = P_{10}/4$ and $Q_2^* = P_{20}/4$
3 lenl=13 leal=14 leasl = 15 

5. (a) 7 = PyQ(a, BY 1+ Sig)? — Praga — Prob 


Exercise 11.7 

1. (8a*/O Pao) = PoQme "(F< 0 (86*/9Pan) = —PoQase"/Id| < 0 

2. (a) Four, (6) (8a"/9 Pa) = (On Qu — QaQan) Poll + io)? /[/] > 0 
(c) (Ba* fig) = (Qa Qos — On Oar) PFU + toy 7/4] < 0 


Exercise 12.2 
1. (a) z* = 1/2, attained when λ* = 1/2, x* = 1, and y* = 1/2
(c) z* = −19, attained when λ* = −4, x* = 1, and y* = 5
4. $Z_\lambda = -G(x, y) = 0$  $Z_x = f_x - \lambda G_x = 0$  $Z_y = f_y - \lambda G_y = 0$
5. Hint: Distinguish between identical equality and conditional equality.


Exercise 12.3 


1. (a) $|\bar{H}| = 4$; z* is a maximum.  (c) $|\bar{H}| = -2$; z* is a minimum.


Exercise 12.4 

2. (a) Quasiconcave, but not strictly so. (c) Strictly quasiconcave. 
4. (a) Neither. (c) Quasiconvex, but not quasiconcave. 

5. Hint: Review Sec. 9.4.

7. Hint: Use either (12.21) or (12.25').


Exercise 12.5 

1. (b) λ* = 3, x* = 16, y* = 11  (c) $|\bar{H}| = 48$; condition is satisfied.

3. $(\partial x^*/\partial B) = 1/2P_x > 0$  $(\partial x^*/\partial P_x) = -(B + P_y)/2P_x^2 < 0$
$(\partial x^*/\partial P_y) = 1/2P_x > 0$  etc.

5. Not valid.

7. No to both (a) and (b); see (12.32) and (12.33').





Exercise 12.6 

1. (a) Homogeneous of degree one.  (c) Not homogeneous.
(e) Homogeneous of degree two.

. They are true, 

(a) Homogeneous of degree a + b + c.

(a) P.O =sUK,jL) —() Hints Let j = 1/h. 

(d) Homogeneous of degree one in K and L. 


4 
1 
8 




Exercise 12.7 

1. (a) 1:2:3  (b) 1:4:9

2. Hint: Review Figs. 8.2 and 8.3. 

4. Hint: This is a total derivative.

6. (a) Downward-sloping straight lines.  (b) $\sigma \to \infty$ as $\rho \to -1$
8. (a) 7  (c) $\ln 5 - 1$

Exercise 13.1 


3. The conditions $x_j(\partial Z/\partial x_j) = 0$ and the conditions $\lambda_i(\partial Z/\partial \lambda_i) = 0$ can be condensed.
5. Consistent. 


Exercise 13.2 

1. No qualifying arc can be found for a test vector such as $(dx_1, dx_2) = (1, 0)$.

3. $(x_1^*, x_2^*) = (0, 0)$ is a cusp. The constraint qualification is satisfied (all test vectors are
horizontal and pointing eastward); the Kuhn-Tucker conditions are satisfied, too.

4. All the conditions can be satisfied by choosing yg = 0 and yy} > 0. 


Exercise 13.4 
2. (a) Yes. (b) Yes. (c) No. 
4. (a) Yes. (b) Yes. 


Exercise 14.2 

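1. (a) $-8x^{-2} + c$, $(x \ne 0)$  (c) $\frac{1}{6}x^6 - \frac{3}{2}x^2 + c$

2. (a) $13e^x + c$  (c) $5e^x - 3x^{-1} + c$, $(x \ne 0)$
3. (a) $3\ln|x| + c$, $(x \ne 0)$  (c) $\ln(x^2 + 3) + c$

4. (a) $\frac{2}{3}(x + 1)^{3/2}(x + 3) - \frac{4}{15}(x + 1)^{5/2} + c$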

Exercise 14.3 

L@d 3b (5 “) 

2. (a) fe? - “4 (e-) &(fet - te? +e-1) 

3. (b) Underestimate.  (e) f(x) is Riemann integrable.
Exercise 14.4 

1. None.

2. (a), (c), (d) and (e).

3. (a), (c) and (d) convergent; (e) divergent. 

Exercise 14.5 

1. (a) $R(Q) = 14Q^2 - \frac{10}{3}e^{0.3Q} + \frac{10}{3}$  (b) $R(Q) = 10Q/(1 + Q)$
3. (a) K(t) = 9143 + 6 

8. (a) 29,000 




Exercise 14.6 


1. Capital alone is considered. Since labor is normally necessary for production as well,
the underlying assumption is that K and L are always used in a fixed proportion.


3. Hint: Use (6.8).
4. Hint: $\ln u - \ln v = \ln\frac{u}{v}$


Exercise 15.1 
(a) (=e 43 () = -e ) 
3. (a) p(t) =4(1 7) (e) pt) = be” (e) yt) = 8e" - 1 


Exercise 15.2 
1. The D curve should be steeper.
3. The price adjustment mechanism generates a differential equation.


B+é @ 
5. (a) P(t) = Aexp (**.) + B+ (6) Yes. 
Exercise 15.3 
1. y(t) = de® 43 
Rr) set +5 
5. yith=e ® — fel 
6. Hint: Review Sec. 14.2, Example 17. 


Exercise 15.4 
1. @) yh =C/F)? ©) yt yPtae 


Exercise 15.5 
1. (a) Separable; linear when written as $\frac{dy}{dt} + \frac{1}{t}y = 0$


(c) Separable; reducible to a Bernoulli equation.
3. $y(t) = (A - t^2)^{1/2}$


Exercise 15.6 

1. (a) Upward-sloping phase line; dynamically unstable equilibrium.
(c) Downward-sloping phase line; dynamically stable equilibrium.

3. The sign of the derivative measures the slope of the phase line.


Exercise 15.7 

Lor sre ry, {ef. (10.25)] 

4. (a) Plot (3 − y) and ln y as two separate curves, and then subtract. A single
equilibrium exists (at a y value between 1 and 3) and is dynamically stable.




Exercise 16.1 


1.@) %=27/5 Oy=3 © y=6? 
3. @) vit) = be te" 3 (e) yp) =el tte +3 
6. Hint: Apply L'Hôpital's rule.


Exercise 16.2
@ Rb 3V3 (co) pt 7 
(b) Hint: When θ = π/4, line OP is a 45° line.


5. (a) $\frac{d}{d\theta}\sin f(\theta) = f'(\theta)\cos f(\theta)$  (b) $\frac{d}{d\theta}\cos\theta^3 = -3\theta^2\sin\theta^3$
2a) V3t+i @l-i 


Exercise 16.3 
L. y(t) = eG cos 2¢ + 4 sin 2) 





3. (hoe? (- cos 4 + 4 sin +1) +3 
S. (0) = 300832 + sin3r + 4 
Exercise 16.4 
mau B+é at+y aty 
1. Pl 4 —— pl— ——— = - b) P, = 
pow pew AM OF bts 


3. (a) P(t) = ef!?(2eos 32 + 2sin $4) +2 


Exercise 16.5 


dx 
1 (@) 7 tJ g) = fla — Fp) 
(b) No complex roots; no fluctuation. 
3. (c) Both are first-order differential equations. (@)g#l 
= = | 


vz . v2 2 
4, (a) x(t) =e! («0 ait Agsin ra +m (ce) PamU= an” 
Exercise 16.6 
LW) y=l-2 O yay 
Exercise 16.7
1. (a) yp =4 ©) y= HP 
3. (a) Divergent. (c) Convergent. 
Exercise 17.2 


1@ Mer =M+7  (C) Me = 3% - 
3. (a) y= 104d (co) yr = poor! pita tat pete 




Exercise 17.3 
1. (a) Nonoscillatory; divergent.  (c) Oscillatory; convergent.
3. (a) $y_t = -8(1/3)^t + 9$  (c) $y_t = -2(-1/4)^t + 4$


Exercise 17.4 

1 Qa —fUPo— Pyl—-a/py - BP 

3. (a) $\bar{P} = 3$; explosive oscillation.  (c) $\bar{P} = 2$; uniform oscillation.
5. The lag in the supply function.


Exercise 17.5 
1. a = −1
3. $P_t = (P_0 - 3)(-1.4)^t + 3$, with explosive oscillation.
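
The explosive oscillation in answer 3 of Exercise 17.5 can be seen by evaluating the stated time path for a few periods. This is a sketch only: the initial price P0 = 4 is a hypothetical value picked for illustration, and the function name `price_path` is not from the text.

```python
def price_path(p0, periods, p_bar=3.0, b=-1.4):
    """Evaluate P_t = (P0 - p_bar) * b**t + p_bar for t = 0, ..., periods - 1."""
    return [(p0 - p_bar) * b ** t + p_bar for t in range(periods)]

# Hypothetical initial price; any P0 different from 3 shows the same pattern.
path = price_path(p0=4.0, periods=6)
deviations = [p - 3.0 for p in path]

for t, (p, dev) in enumerate(zip(path, deviations)):
    print(t, round(p, 4), round(dev, 4))
```

Because the base −1.4 is negative with absolute value above one, the deviation from the intertemporal equilibrium price 3 changes sign every period and grows in magnitude.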


Exercise 17.6 
1. No. 
2. (b) Nonoscillatory, explosive downward movement.
(d) Damped, steady downward movement toward 2.
4. (a) At first downward-sloping, then becoming horizontal.


Exercise 18.1 
L@itt @4-1 


3. (a) 4 (stationary)  (c) 5 (stationary)
, = V3t wrt sine 
4. (6) y= v2 (200s qi t sin “1) +1 


Exercise 18.2 


1. (a) Subcase 1D.  (c) Subcase 1C.
3. Hint: Use (18.16). 


Exercise 18.3 

3. Possibilities v, viii, x, and xi will become feasible. 

4.) Pra -(2- JO ~ 8) — Bela + AL — 8) — BRO - Spe = iBone 
() pko4 


Exercise 18.4 

1. (a) 1 (ce) 3° 43441 
R@) yaht (ly s2-04eP 
5. (@) 1/2, —L and 1 




Exercise 19.2 

2 ob +6?-3b42=0 

3 (@) x =-B¥ +42 470 ye = 203) + 2-2) +5 
4, (@) x(t) =4e7 Be $12 yu) ee pe % +4 


Exercise 19.3 

2. (c) B= (81 - Ayu 

3. (0) B=(of +1 - AA 

5. (co) a() = de B10 4 ag —Hles10 et/t0, wilt) = eA 2 dg HeI0 Berio 


Exercise 19.4 
4, ole] a a) a C=) 


u,| =| 3-193 | 64 + 23+ S193 | 64 
48 ' eB 
be 
+ Li 
é KB) 


Exercise 19.5 

1, The single equation can be rewritten as two first-order equations. 
2. Yes. 

4. (a) Saddle point.


Exercise 19.6 
1. (a) $|J_E| = 1$ and tr $J_E = 2$; locally unstable node.
(c) $|J_E| = 5$ and tr $J_E = -1$; locally stable focus.
2. (a) Locally a saddle point.  (c) Locally stable node or stable focus.
4. (a) The x' = 0 and y' = 0 curves coincide, and provide a lineful of equilibrium
points.
Exercise 20.2 
l~t re 
ele * * = 42 
Lit=l-1 ou 5 ya5-qgt 


6A) =3e 3 WH) =2— op" = Te -2 


Exercise 20.4 
Lasse te) Kt =1/2(8 +e) 


Index 





A 

Abscissa, 36 

Absolute extremum, 222-223, 291, 
319, 347 

Absolute value 


of complex numbers, 512 
highest, of dominant root, 574 
inequality and, 137-138 
marginal rate of technical 

substitution and, 199 

Absorbing Markov chains, 81 

Acceleration coefficient, 576 

Accelerator, interaction with 

multiplier, 576-581

Adaptive expectations, 

533, 558, 581 

Additive constant, 153 

Additivity property, 459 

Adjoint, 100 

Adjustment coefficient, 480 

Algebraic function, 23 

Alien cofactor, 99-100 

Amplitude, 516 

Antiderivative, 446. See

also Integral 

Antilog, 472 

Area under a curve, 455-458

Argand diagram, 512

Argument, 18 

Arrays, matrices as, 49-50 

Arrow, K. J., 369n, 397n, 

425n, 426n 
Arrow-Enthoven sufficiency 
theorem, 425-426 

Associative law 
of matrix operations, 67, 68-69 
of set operations, 13 

Asymptote, 23 

Autonomous differential 

equation, 496

Autonomous problems, 644-645 

Auxiliary equation, 506 

Average, weighted, 328 

Average cost, vs. marginal 

cost, 159-160 

Average revenue 
marginal revenue vs., 156-158
in relation to demand, 332-333 


B 


Balance of payments, 214 
Base 
of exponential function, 256, 259 
of logarithmic function, 267-269
Base conversion, 257, 274-276 
Basis, 63 
Behavioral equation, 6-7 
Bernoulli equation, 493, 494, 501 
Boltyanskii, V. G., 633n
Bordered discriminant, 358-363
Boundary condition, 445 
Boundary irregularities, 412-414, 415 
Boundary solution, 403 
Budget constraint, 348, 374-375, 
418-420 


C


Calculus of variations, 631 
Capacity constraint, 420-423 
Capital 
dynamics of, 498-502 
investment and, 465-467 
Capital flows, 213 
Capital formation, 465-467, 607-608 
Capitalization formula, 470 
Cartesian coordinates, 519, 572
Cartesian product, 16
Cash flow, present value of, 468-469
CES production function, 397-400
as quasiconcave function, 398
in relation to Cobb-Douglas
production function, 399-400
Chain rule, 161-163, 190, 193, 289 
Change, rate of. See Rate of change 
Change of official settlement, 214n
Channel map, 190, 191, 192, 210
Characteristic equations, 
506, 601-602 
of difference equation, 570 
of difference-equation system, 
595, 598 
of differential equation, 506 
of differential-equation system, 600
of matrix, 308 
Characteristic matrices, 308 





Characteristic roots 
of difference equation, 570-573 
of difference-equation system, 595 
of differential equation, 506-510 
of differential-equation system, 599
domination of, 574
dynamic stability of equilibrium and,
510, 527, 573-575
sign definiteness of quadratic form 
and, 307-311 
Characteristic vector, 307, 308 
Chenery, H. B., 397n 
Chiang, A. C., 3, 302n, 631n 
Choice variable, 221 
Circular function, 23, 513-515 
Closed input-output model, 119-120 
Closed interval, 133 
Cobb-Douglas production function, 
337, 386-388, 389 
applications of, 393, 501 
elasticity of substitution of, 396 
expansion path of, 393 
in relation to CES production 
function, 399-400 
as strictly quasiconcave
function, 386 
Cobweb model, 555-558 
Coefficient(s), 6 
acceleration, 576 
adjustment, 480 
constant, 503 
fractional, 39 
input, 113 
undetermined, 538-540, 586-588,
604, 607 
of utilization, 473 
Coefficient matrix, 50 
Cofactor 
alien, 99-100
defined, 91 
Cofactor matrix, 100 
Column vector, 50, 53, 55
Commutative law 
of matrix operations, 67 
of set operations, 13 
Comparative-static derivative. See 
Comparative statics 
Comparative statics, 121, 124-125
of input-decision model, 343-345 




Comparative statics (Continued) 
of least-cost-combination
model, 392-396
of market models, 205-207
of multiproduct firm, 342-343
of national-income models, 210-213
total derivative applied to, 209-210
of utility-maximization
model, 378-382 
Complement set, 12 
Complementary functions 
dynamic stability of equilibrium 
and, 481, 551 
of first-order difference 
equation, 548-549
of first-order differential 
equation, 477, 478 
of higher-order difference equation, 
569, 570-573, 594-595 
of higher-order differential equation, 
504-505, 522-524, 541
of simultaneous difference equations,
597, 598, 600 
of variable-coefficient differential 
equation, 485 
Complementary slackness, 404, 406, 
407, 408-409, 419
Completing the square, 37, 239n, 
303, 305 
Complex numbers, 511-512
alternative expressions for, 519-521 
conjugate, 512-513 
Complex roots, 507-510, 512-513,
572, 573, 579
Composite-function rule, 162. See also 
Chain rule 
Compressing, 258, 274 
Concave functions, 330
convex functions vs., 230-231,
318-320 
criteria for checking, 320-324 
in nonlinear programming, 424-425
Concave programming, 425 
Conditional equation, 7 
Conjugate complex numbers, 512-513
Constant(s), 302
additive, 153 
defined, 6 
exponents as, 256 
of integration, 446 
multiplicative, 153 
parametric, 6 
Constant coefficients, 503 
Constant function, 20, 21, 148-149, 187 
Constant-function rule, 148-149, 187 








Constant returns to scale (CRTS), 384,
386-387, 390, 397 
Constrained extremum. See also Linear 
programming; Nonlinear
programming 
determinantal test for, 362 
in relation to quasiconcavity and 
quasiconvexity, 372-374 
Constrained optimum, 347 
Constrained quadratic form, 358-359 
Constraint 
budget, 348, 374-375, 418-420
capacity, 420-423
effects of, 347-349
inequality, 404-408 
linear, 416-418 
multiconstraint cases, 354-355,
362-363 
in nonlinear programming, 
404-408 
ration, 418-420
Constraint qualification, 412, 415-418 
Constraint-qualification test, 426-427
Consumption function, 46, 576
Continuity, 141-142 
of derivative function, 154 
of polynomial function, 142 
of rational function, 142-143
in relation to differentiability,
143-147
Continuity theorem, 142 
Continuous growth, 265-266 
Continuous time, 444 
Continuous variable, 444 
Continuously differentiable
functions, 154, 227 
Control variable, 631 
Convergence, 565
divergence vs., 578-581
of improper integral, 461-464
of series, 249, 261
Convergent time path, 526. See also
Dynamic stability of equilibrium 
Convex combination, 328-330 
Convex functions 
concave functions vs., 230-231,
318-320
convex set vs., 327-330
criteria for checking, 320-324 
in nonlinear programming, 424 
Convex set, vs. convex function, 
327-330
Coordinate(s) 
Cartesian, 519, 572 
polar, 520 


Cosine function, 514 
derivative of, 517
Maclaurin series of, 518
properties of, 515-517
table of values of, 515, 520
Cost(s)
average vs. marginal, 159-160
marginal vs. total, 128-129,
153, 464-465
minimization of, 390-401 
Cost functions, 7 
cubic, 238-242
relation between average and
marginal, 159-160
relation between marginal and total,
128-129, 153, 464-465
Costate equation, 633, 634, 638
Costate variable, 633 
Counting equations and unknowns, 44 
Courant, R., 253n
Cramer's rule, 103-107, 605, 607
Critical value, 224 
Cross effect, 381 
Cross partial derivatives, 296 
CRTS. See Constant returns to 
scale (CRTS) 
Cubic equation, vs. cubic function, 35n
Cubic function, 21, 22, 38
cost functions, 238-242
cubic equation vs., 35n
Current-value Hamiltonian
function, 645, 651
Current-value Lagrange multiplier, 645
Cusp, 413, 414 





D 


Damped fluctuation, 526, 561
De Moivre's theorem, 521, 572 
Decay, rate of, 266 
Definite integral, 447, 454-461
as area under a curve, 455-458
properties of, 458-460 
Definiteness, positive and negative, 302, 
306, 307, 311 
Definitional equation, 6 
Degree 
of differential equation, 475 
higher-degree polynomial
equations, 38-40 
of polynomial function, 21 
Demand, 31, 32, 35 
average revenue and, 332-333 
elasticity of, 187, 335-336





excess. See Excess demand
final, 113

Hicksian demand functions, 436
input, 113


Marshallian, 435, 437, 438, 439
with price expectations, 527-528
Demarcation curves, 615-617
Demarcation line, 615, 616 
Denumerable set, 9 
Dependence 
among columns or rows of matrix, 96 
among equations, 44-45, 85 
linear, 62-63 
Dependent variable, 18 
Derivation, 143 
Derivative(s), 126-127 
comparative-static. See Comparative 
statics 
continuity of, 154 
of cosine function, 517 
derivative of, 227-220 
of exponential functions, 278-280 
fifth, 228 
first, 223-226 
fourth, 228 
marginal function and, 128-129, 153
partial. See Partial derivative
partial total, 192, 193
rules of. See Differentiation rules
second, 227-233
third, 228
total, 189-194, 209-210
Derivative conditions, vs. differential 
conditions, 291-293 
Derivative function, 127 
Descartes, R., 16 
Determinant, 45, 48, 88-98
defined, 88 
factoring, 95 
first-order, 137n 
Hessian. See Hessian determinant 
Jacobian. See Jacobian determinant
Laplace expansion of, 91-93
nth-order, 91-94
properties of, 94-96, 98
second-order, 89
third-order, 89-91
vanishing, 89, 95
zero-value, 89, 95
Determinantal test 
for relative constrained 
extremum, 362 
for relative extremum, 317 
for sign definiteness of quadratic 
form, 302-304 


Deviation, 244 
Diagonal matrix, 69, 73 
Diagonalization, of matrix, 310-311
Diagram. See also Phase diagram
Argand, 512 
Venn, 12 
Difference 
first, 545 
second, 568 
Difference equation, 544. See also
Complementary functions;
Particular integral; Simultaneous
difference equations 
classification of, 545, 568, 586, 588 
definite vs. general solution of, 548 
iterative method of solving, 546-548 
particular solution of, 548 
phase diagram for, 562-567 
Difference quotient, 125-126 
Differentiability 
continuity in relation to, 143-147
twice, 154, 227
Differentiable functions,
324-327, 368-372
Differential
rules of, 187-189
total. See Total differential 
Differential calculus, 125 
Differential conditions, vs. derivative 
conditions, 291-293 
Differential equation, 446. See also
Complementary functions;
Particular integral; Simultaneous
differential equations
autonomous, 496
classification of, 475-476, 483, 484,
486-487, 492, 503, 540 
definite vs. general solution of, 476 
degree of, 475 
exact, 486-490 
homogeneous, 476, 478 
nonhomogeneous, 476-478
normalization of, 475n
particular solution of, 476
phase diagram for, 
495-498, 500-501 
reduced, 477 
with separable variables, 492-493
Differentiation 
differentiability vs., 143 
exponential-function rule of, 278 
total, 185, 190
Differentiation rules
chain rule, 161-163, 190, 193
constant-function rule, 148-149, 187 







exponential-function rule, 278 
implicit-function rule, 197-198, 
202, 387 
log-function rule, 277-278
power-function rule, 149-152, 187
product rule, 155-156, 187 
quotient rule, 158-159, 187 
sum-difference rule, 152-155, 187 
Diminishing returns, 239, 499 
Direct product, 16 
Discount, quantity, 131n 
Discount factor, 266 
Discounting, 266, 283. See also
Present value 
Discrete growth, 265-266 
Discrete time, 444 
difference equations and, 544-545 
dynamic stability of equilibrium 
with, 551-554, 573-575
Discrete variable, 444
Discriminant 
bordered, 358-363 
determinant vs., 303 
Disjoint set, 11
Distance, 64-65 
Distinct real roots, 507-508, 570-571 
Distribution parameter, 397 
Distributive law 
of matrix operations, 67, 69 
of set operations, 13-14 
Divergence, vs. convergence, 578-581.
See also Convergence
Divergent time path, 526
Domain, 18, 19
Domar, E. D., 471n
Domar growth model, 471-474, 475
Dominant root, 574
Domination, of characteristic roots, 574
Dorfman, R., 45n
Double roots, 508 
Dual problems, 435-441 
Duality, 435n, 436-437 
Dunn, Sarah, 79n 
Dynamic analysis, limitations of, 654 
Dynamic equations 
high-order, transformation of, 
593-594 
simultaneous, solving, 594-603


Dynamic instability, 497 
Dynamic optimization, 442, 631 


Dynamic stability, 497
Dynamic stability of equilibrium,
481-482
with continuous time, 510, 525-527 
with discrete time, 551-554, 573-575 




Dynamic stability of equilibrium 
(Continued) 
local stability of nonlinear system, 
623, 625-629 
phase diagram and, 495-498, 
562-565, 619-620 
Routh theorem and, 542-543
Dynamic systems, genesis of, 592-594
Dynamics, 444 
of capital, 498-502 
of inflation and monetary rule, 629 
of inflation and unemployment, 
532-537, 581-585, 609-614 
of input-output models, 603-609
integration and, 444-446
of investment, 498-502
of market price, 479-483, 527-532,
555-562, 565-567
of national income, 576-581 





E


e, the number, 260-262
Echelon matrix, 86-87
Econometrics, vs. mathematical
economics, 4

Economic model, 5-7
Economically nonbinding solution, 420
Efficiency parameter, 388, 397
Eigenvalue, 307n
Eigenvector, 307n
Elasticity 

chain rule of, 289 

of demand, 187, 335-336

of optimal input, 395

of output, 388

partial, 186, 187 

point, 288-289 
Elasticity of substitution 

of CES function, 397 

of Cobb-Douglas function, 396 
Elimination of variables, 33-34, 

LL, 16 

Endogenous variables 

exogenous variables vs., 5-6 


Jacobian determinant, 203, 208, 212, 


343 344, 353 
Enthoven, A. C., 369n, 425n, 426n
Envelope theorem, 428-441
for constrained optimization,
432-433
derivation of Roy's identity 
and, 437-438 


maximum-value functions and,
428-435
for unconstrained optimization,
428-432 
Equality 
matrix, 51, 56 
of sets, 10 
Equation(s)
auxiliary, 506
behavioral, 6-7
Bernoulli, 493, 494, 501 
characteristic. See Characteristic 
equations 
conditional, 7 
costate, 633, 634, 638 
cubic, 35n
definitional, 6 
differential. See Differential equation 
exponential, 268, 271 
homogeneous, 476, 478
of motion, 631, 633-634
nonhomogeneous, 476-478
quadratic. See Quadratic equation
reduced, 477
state, 633-634, 644-645
Equation system 
consistency and independence in,
44-45, 85
dynamic. See Simultaneous
difference equations;
Simultaneous differential
equations
homogeneous, 105-106, 119-120,
595, 598
linear, 48, 77-78, 106-107
Equilibrium, 30-47 
defined, 30 
dynamic stability of. See Dynamic
stability of equilibrium
general, 40-45
goal, 31, 220
intertemporal, 480, 481
moving vs. stationary, 482
in national-income analysis, 46-47
open-economy, 214-216
partial, 31, 43 
types of, 618-620 
Equilibrium analysis. See Static analysis 
Equilibrium condition, 7 
Equilibrium identity, 206, 208, 211, 212 
Equilibrium output, 236 
Equilibrium values, 32 
Euclidean n-space, 60, 64, 65
Euler relations, 517-519





Euler’s theorem, 385-386, 388-389 
Exact differential equation, 486-490
Excess demand, 31, 41 

output adjustment and, 605-607 

price adjustment and, 480 

in relation to inventory,
559-560
Exchange rate, fixed, 214
Exhaustible resource, 647-649
Exogenous variables, 5-6
exp, 259
Expansion path, 392-394
Expectations
adaptive, 533, 558, 581
inflation, 533, 536, 581
price, 527-528, 558
Expectations-augmented Phillips
relation, 533-534, 581
Expected rate of inflation, 536
Expected utility from playing, 232
Explosive fluctuation, 525-526
Explosive oscillation, 566, 596
Exponent(s), 21, 23-24, 256
Exponential equation, 268, 271
Exponential function(s), 22, 23,
256-267
base conversion of, 274-276
base of, 256, 259
derivative of, 278-280
discounting and, 266
generalized, 257-259
graphical form of, 256-257
growth and, 260-267
interest compounding and, 262-263
logarithmic functions and, 272-273














Maclaurin series of, 261
natural, 259
Exponential-function rule
of differentiation, 278
of integration, 448 
Exponential law of growth, 255 
Exports, net, 213
Extreme value, 221, 293-301
Extremum, 221
absolute vs. relative, 222-223,
291, 319, 347 
constrained, 362, 372-374 
determinantal test for constrained 
extremum, 362 
determinantal test for relative 
constrained extremum, 362 
determinantal test for relative 
extremum, 317 
first-order condition for, 313 
global vs. local, 222-223


in relation to concavity and 
convexity, 318-320 

in relation to quasiconcavity and 
quasiconvexity, 372-374 

strong vs. weak, 318 


Factor(s) 
discount, 266 
integrating, 489-490 
Factorial, 243 
Factoring 
of determinant vs. matrix, 95 
of integrand, 450 
of polynomial function, 38-39 
Fair bet, 232 
Fair game, 232 
Final demand, 113 
Finite Markov chains, 80 
Finite set, 9 
First-derivative test, 223-226
First-order condition, 234, 
294-295, 402 
derivative vs. differential form of, 
291-292, 293 
for extremum, 313 
necessary vs. sufficient, 295
Fiscal policy, 534 
Fixed exchange rate, 214 
Fixed terminal point, 639 
Flow concept, 264, 466-467 
Fluctuation 
damped, 526, 561 
explosive, 525-526 
stepped, 574-575, 579, 580, 584
time path with, 525-527, 534-537 
uniform, 526 
Focus, 618-619 
Form, 301 
Formby, J. P., 240n
45-degree line, 564
Fraction, 7 
Free optimum, 347 
Friedman, M., 533 
Function(s), 17-28
algebraic vs. nonalgebraic, 23 
argument of, 18 
circular, 23, 513-515 
Cobb-Douglas. See Cobb-Douglas 
production function 
complementary. See Complementary
functions 


concave vs. convex, 230-231, 
318-320 

constant, 20, 21, 148-149, 187

consumption, 46, 576

continuous vs. discontinuous,
141-142

continuously differentiable,
154, 227

cubic, 21, 22, 35n, 38, 238-242 

decreasing vs. increasing, 163 

defined, 17 

derivative, 127 

differentiable, 324-327, 368-372 

domain of, 18, 19 

exponential. See Exponential 
function(s) 

general vs. specific, 27-28

graphical form of, 22, 516 

Hamiltonian. See Hamiltonian 
function 

homogeneous. See Homogeneous
functions 

homothetic, 394-395 

implicit, 194-199 

inverse, 163, 272, 622 

Lagrangian. See Lagrangian
functions 

linear, 21, 22, 27 

logarithmic. See Logarithmic 
functions 

maximum-value, 428-435 

objective, 221, 313-317, 632, 644 

polynomial. See Polynomial
functions

production. See Production functions 

profit, 429-430 

quadratic, 21, 22, 27, 35-36 

quasiconcave vs. quasiconvex, 
364-371 

range of, 18, 19 

rational, 21-23, 142-143 

saddle point of, 295, 299, 302

sinusoidal, 514

social-loss, 69 

step, 131, 552 

Taylor series of, 624

transcendental, 23 

trigonometric, 23, 514 

of two variables, extreme values
of, 293-301 

value of, 18, 19 

zeros of, 36

Function-of-a-function rule, 162. See
also Chain rule




G 


Gamkrelidze, R. V., 633n 
General-equilibrium analysis, 43 
Giffen goods, 381
Global extremum, 222-223
Goal equilibrium, 31, 220 
Greek alphabet, 655
Gross investment, 466 
Growth 
continuous vs. discrete, 265-266 
Domar model of, 471-474, 475 
exponential functions and, 260-267
exponential law of, 255
instantaneous rate of, 263-265,
286-288
negative, 266
neoclassical optimal model of,
649-651
rate of, 263-265, 286-288
Solow model of, 498-502, 652 


H 


Hamiltonian function 
current-value, 645, 651 
for optimal control problems, 633, 
634, 635-638, 641, 642, 651
Hawkins, D., 116n
Hawkins-Simon condition, 116 
economic meaning of, 118-119 
principal minor and, 304, 305, 306, 
314 
Hessian determinant, 304, 314, 316
bordered, 358-363, 371-372, 439n 
Jacobian determinant in relation to, 
343-344 
Hessian matrix, 314-315
Hicksian demand functions, 436 
Homogeneous equation, 476, 478 
Homogeneous-equation system,
105-106, 119-120, 595, 598
Homogeneous functions 
economic applications of, 382, 
383-390 
linearly, 383-386, 388-389 
Homothetic function, 394-395
Horizontal intercept, 274
Horizontal terminal line, 639, 640-643
Hotelling’s lemma, 430, 432, 438
Hyperbola, rectangular, 21-23, 561, 580
Hypersurface, 26




i, the number, 511 
Idempotent matrices, 71, 73, 78 
Identity, 6
equilibrium, 206, 208, 211, 212
Roy's, 437-438, 440
Identity matrix, 58, 69, 70-71 
Image, 18. See also Mirror images
Imaginary axis, 512
Imaginary number, 511
Implicit function, 194-199 
Implicit-function rule, 197-198,
202, 387
Implicit-function theorem, 196, 198n,
199-200, 201
application procedure, 216-217 
applied to national-income models, 
203-204, 210-213 
applied to optimization models, 
343-345, 353-354, 378 
Income effect, 380, 381 
Income increment, 547 
Indefinite integral, 446-454, 460
Independence. See Dependence 
Independent variable, 18 
Indifference curve, 375-378 
Induced investment, 576 
Inequality, 136-139 
absolute values and, 137-138
continued, 136 
rules of, 136 
sense of, 136 
solution of, 138-139 
Inequality constraints, 404-408 
Inferior good, 379
Infinite integrand, 463-464
Infinite series, 261, 517-519
Infinite set, 9
Infinite time horizon, 649-653
Inflation, 533 
actual vs. expected rate of, 536 
monetary, 629 
unemployment and, 532-537,
581-585, 609-614
Inflation expectations, 533, 536, 581 
Inflection point, 225, 231, 234n, 
152, 205 
Initial condition, 445 
Inner product, 54 
Input coefficient, 113
Input-coefficient matrix, 113-114
Input decision, 336-341
Input-decision model, 343-345
Input demand, 113 


Input-output model
closed, 119-120
dynamic, 603-609
Leontief, 112-121
open, 113-116 
static, 112-121 
Instantaneous rate of change, 126 
Instantaneous rate of growth, 263-265,
286-288
Integers, 7 
Integral, 446, 475 
definite, 447, 454-461
economic applications of, 464-470 
improper, 461-464 
indefinite, 446-454, 460 
lower vs. upper, 457 
of a multiple, 450-451
particular. See Particular integral
Riemann, 457, 459
of a sum, 449-450
Integral calculus, 445 
Integral sign, 446 
Integrand, 446 
factoring of, 450 
infinite, 463-464
Integrating factor, 489-490 
Integration, 445 
constant of, 446 
dynamics and, 444-446 
limits of, 454, 460, 461-463
by parts, 452-453, 460 
Integration rules 
exponential rule, 448 
integration by parts, 452-453, 460
logarithmic rule, 448
power rule, 447
rules of operations, 448-451
substitution rule, 451-452
Intercept 
horizontal, 274 
vertical, 21 
Interest compounding, 262-263 
Interior solution, 403
Intersection set, 11 
Intertemporal equilibrium, 480, 481 
Interval, closed vs. open, 133
Invariance property, 382
Inventory, market model with, 559-562
Inverse, 56
Inverse function, 163, 272, 622 
Inverse matrices 
finding, 99-103 
properties of, 75-77 
solution of linear-equation system
and, 77-78 


Investment, 211, 471-474
capital formation and, 465-467 
dynamics of, 498-502 
gross, 466 
induced, 576 
net, 466, 467
replacement, 466
Irrational number, 8
Isocost, 391
Isoquant, 339-341, 391, 392-394
Isovalue curves, 392n 
Iterative method, for difference equation, 
546-548 


Jacobian determinant, 45 
endogenous-variable, 203, 208, 212,
343-344, 353
in relation to bordered Hessian, 359
in relation to Hessian, 343-344


K 


Keynes, J. M., 46, 576 
Keynesian multiplier, 576
Kuhn, H. W., 402n, 424
Kuhn-Tucker conditions, 402-412
economic interpretation of, 408-409
effects of inequality constraints,
405-408
minimization version of, 410
optimal control theory and, 640
Kuhn-Tucker sufficiency theorem,
424-425


Lag 
in consumption, 576 
in production, 603-605 
in supply, 555 
Lagrange, J. L., 126-127 
Lagrange form of the remainder, 
248-249 
Lagrange multiplier 
current-value, 645 
economic interpretation of, 353-354,
375, 391
general interpretation of, 434-435
Lagrange-multiplier method,
350-352, 353


Lagrangian functions 

in finding stationary values, 
350-352, 354-355 

in nonlinear programming, 403, 
409, 410 

Laplace expansion 
by alien cofactors, 99-100 
evaluating an nth-order determinant
by, 91-93

Latent root, 307n

Layson, S., 240n 

Least-cost combination of inputs, 

390-401 

Leibniz, G. W., 127

Leontief, W. W., 112

Leontief input-output models, 112-121 

Leontief matrix, 115, 116 

L’Hôpital’s rule, 399, 400

Lifetime utility maximization, 645-647 

Limit, 129-135 
evaluation of, 131-132 
formal view of, 133-135 
of integration, 454, 460, 461-463 
left-side vs. right-side, 129-131
of polynomial function, 141 

Limit theorems, 139-141 

Linear approximation, to a 

function, 246-248 

Linear combination, 61, 62 

Linear constraints, 416-418 

Linear dependence, 62-63 

Linear-equation system, 48,
77-78, 106-107

Linear form, 301 

Linear function, 21, 22, 27 

Linear programming, in relation to 

nonlinear programming, 402 
Linearization. See Linear approximation 
Linearly homogeneous functions, 

383-386, 388-389
Linearly homogeneous production 

functions, 384-386 

Literary logic, 3 

ln, 268

Local extremum, 222-223

Log. See Logarithm(s)

Logarithm(s), 48-49, 257, 260-272
common vs. natural, 268-269
conversion formulas, 271
elasticity and, 289
meaning of, 267-268
rules of, 269-271

Logarithmic functions, 22, 23, 272-277 
base of, 267-269 
exponential functions and, 272-273 


Logarithmic-function rule
of differentiation, 277-278
of integration, 448 
Logic, mathematical vs. literary, 3 


M 


Machlup, F., 30n, 444n
Maclaurin series, 242-243 
convergent, 261 
of cosine function, 518 
of exponential function, 261 
of polynomial function, 242-243 
of sine function, 518 
Mapping, 17-18 
Marginal cost 
average cost vs., 159-160 
total cost vs., 128-129, 153, 
464-465 
Marginal physical product, 198 
diminishing, 340, 499 
of labor, 163 
Marginal product, value of, 339 
Marginal propensity to consume, 
46, 211, 547 
Marginal propensity to save, 465 
Marginal rate of substitution, 375 
Marginal rate of technical 
substitution, 391
absolute value and, 199
elasticity of substitution and, 396n
Marginal revenue 
average revenue vs., 156-158 
upward-sloping, 240-241 
Marginal revenue product, 163 
Marginal utility of money, 375
Market models, 31-44, 107-108 
comparative statics of, 205-207 
dynamics of, 479-483, 527-532, 
555-562, 565-567 
with inventory, 559-562 
Market price, dynamics of, 479-483,
527-532, 555-562, 565-567 
Markov chains, 78-81 
absorbing, 81 
finite, 80 
Markov transition matrix, 79-80 
Marshallian demand, 435, 437, 438, 439 
Mathematical economics 
defined, 2 
econometrics vs., 4
nonmathematical economics
vs., 2-4
Mathematical logic, 3 




Mathematical model, 5-7 
Mathematical symbols, 656-658 
Mathematically binding solution, 420 
Matrices, 49-59 
addition of, 51-52, 67
as arrays, 49-50 
characteristic, 308 
coefficient, 50 
cofactor, 100 
defined, 50 
diagonal, 69, 73 
diagonalization of, 310-311 
dimension of, 50, 53
division of, 56 
echelon, 86-87 
elements of, 50 
equality, 51, 56 
factoring of, 95 
Hessian, 314-315 
idempotent, 71, 73, 78 
identity, 55, 69, 70-71
inverse, 75-78, 99-103
laws of operations on, 67-70
lead vs. lag, 53, 54
Leontief, 115, 116
Markov transition, 79-80
multiplication of, 53-56, 58,
59-60, 68-69
nonsingular. See Nonsingularity
null, 71-72
rank of, 85-87, 97-98 
scalar multiplication of, 52 
singular, 72, 75 
square, 50, 88, 96 
subtraction of, 52, 67 
symmetric, 74 
transpose, 73-74 
vectors as, 50-51 
zero, 71-72
Maximum. See Extremum 
Maximum principle, 633-639 
Maximum-value functions, 428-435
McShane, E. J., 253n
Mean-value theorem, 248 
Metric space, 65 
Minhas, B. S., 397n
Minimization of cost, 390-401 
Minimization version of Kuhn-Tucker 
conditions, 410 
Minimum. See Extremum 
Minor 
bordered principal, 361-362
principal, 116-118, 304, 305,
306, 314
Mirror effect, 554, 556




Mirror images 
in bordered Hessian, 363
in exponential and log functions, 
273-274 
in symmetric matrix, 74 
in time paths, 554 
Mishchenko, E. F., 633n
Mixed partial derivatives, 296 
Models and modeling 
closed, 119-120
of closed economy, 109-111
cobweb, 555-558 
economic, 5-7 
market. See Market models 
mathematical, 5-7
national-income. See National-income models
open, 113-116 
Modulus, 137, 512 
Monetary policy, 534, 581 
Monetary rule, 629 
Money, marginal utility of, 375 
Money illusion, 381
Motion, equation of, 631, 633-634
Multiconstraint cases, 354-355, 362-363
Multiple roots, 508 
Multiplicative constant, 153 
Multiplier 
interaction of, with accelerator,
576-581
Keynesian, 576 
Lagrange. See Lagrange multiplier 
Multiproduct firm, 331-333, 342-343








n-space, 60, 64, 65
n-variable, 307, 354-355
n-vector, 60
National-income models, 46-47, 108-109
comparative statics of, 210-213 
dynamics of, 576-581 
equilibrium in analysis of, 46-47 
implicit-function theorem applied to,
203-204, 210-213
Natural exponential function, 259
Necessary condition, 82-84, 234-235, 
237, 357-358, 424 
Necessary-and-sufficient condition, 83, 
84, 405 
Negative area, 458 
Negative definiteness, 306 
conditions for, 307, 314 
definite vs. indefinite, 302


Negative growth, 266 
Negative semidefiniteness
conditions for, 311
definite vs. semidefinite, 302
Neighborhood, 133-134 
Neoclassical optimal growth model, 
649-651 
Nerlove, M., 558
Net exports, 213 
Net investment, 466, 467 
Neyman, J., 402n 
Node, 618, 626, 627, 629 
Nonalgebraic function, 23 
Nonconstant solution, 478 
Nonconvergent time path, 526 
Nondenumerable set, 9 
Nongoal equilibrium, 31
Nonhomogeneous equation, 476-478
Nonlinear programming, 356n
constraints in, 404-408
economic applications of, 418-424
in relation to linear 
programming, 402 
sufficiency theorems in, 424-428 
Nonmathematical economics, vs. 
mathematical economics, 2-4 
Nonnegative solution, 116-118
Nonnegativity restriction, 402, 403
Nonsingularity, 75 
conditions for, 84-85, 96-97 
test of, 88-94 
Nontrivial solution, 106, 600 
Normal good, 379 
Normalization 
of characteristic vector, 308 
of differential equation, 475n
Nth-derivative test, 253-254
Null matrix, 71-72
Null set, 10
Null vector, 61, 62-63


O


Objective function, 221 
with more than two variables,
313-317 
in optimal control theory, 632, 644 
Obst, N. P., 629
Official settlement, change of, 214n 
One-to-one correspondence, 16, 60, 
163, 165 
Open-economy equilibrium, 214-216 
Open input-output model, 113-116
Open interval, 133 


Operator symbol, 149 
Optimal control 
illustration of, 632-633
nature of, 631-639
Optimal control theory, 631-654
alternative terminal conditions
and, 639-644
autonomous problems in, 644-645
economic applications
of, 645-649
Pontryagin’s maximum principle
in, 633-639
Optimal growth model,
neoclassical, 649-651
Optimal input, elasticity of, 395
Optimal timing, 282-286
Optimization. See also Constrained
extremum
constrained, 432-433 
dynamic, 442, 631 
maximization and minimization 
problems and, 221 
unconstrained, 428-432 
Optimization conditions, 7 
Optimum, constrained vs. free, 347
Optimum output, 236 
Optimum value, 221 
Ordered n-tuple, 50
Ordered pair, 15-16, 17
Ordered sets, 15
Ordered triple, 16 
Ordinate, 36 
Orthant, 369 
Orthogonal vectors, 309 
Orthonormal vectors, 310 
Oscillation, 552, 565
explosive, 566, 596 
time path with, 556-558, 561-562, 
565-567 


Parabola, 21 
Parallelogram, 61-62 
Parameter, 6 

distribution, 397 

efficiency, 388, 397

substitution, 397 
Partial derivative 

cross (mixed), 296 

second-order, 295-297
Partial elasticity, 186, 187
Partial equilibrium, 31, 43
Partial total derivative, 192, 193 


Particular integral 
of first-order difference equation, 549 
of first-order differential equation, 
477, 478 
of higher-order difference equation, 
569-570
of higher-order differential equation,
504-505
intertemporal equilibrium and,
481, 504
of simultaneous difference
equations, 597 
of simultaneous differential 
equations, 599 
of variable-term difference 
equation, 586-588 
of variable-term differential 
equation, 538-540 
Payoff, 231 
Perfect foresight, 537 
Period, 516, 544
Period analysis, 544
Perpetual flow, present value of, 470 





Phase diagram 
analyzing, 653 
constructing, 652-653 
for difference equation, 562-567 
for differential equation, 495-498,
500-501
for differential-equation system,
614-623
dynamic stability of equilibrium and, 
495-498, 562-565, 619-620 
Phase line, 495, 563, 565
Phase path, 617
Phase space, 615
Phase trajectory, 617
Phillips, A. W., 532n
Phillips relation, 532-533
expectations-augmented, 
533-534, 581 
long-run, 537, 585 
Point concept of time, 264
Point elasticity, 288-289
Point of expansion, 242 
Polar coordinates, 520 
Polynomial equations 
higher-degree, 38-40 
roots of, 38-40, 541
Polynomial functions, 20-21 
continuity of, 142 
degree of, 21 
factoring of, 38-39
limit of, 141 


Maclaurin series of, 242-243
Taylor series of, 244-245
Pontryagin, L. S., 633n
Pontryagin’s maximum principle, 
633-639 
Positive definiteness, 306 
conditions for, 307, 311 
definite vs. indefinite, 302 
Positive integers, 7 
Positive semidefiniteness 
conditions for, 311 
definite vs, semidefinite, 302 
Power-function rule, 149-152
in finding total differential, 187 
of integration, 447 
Power series, 242 
Present value, 266
of cash flow, 468-469
of perpetual flow, 470
Price, time path of, 529-532
Price ceiling, 560
Price discrimination, 333-336
Price expectations, 527-528, 558
Primal problem, 435
Primary input, 113 
Primitive function, 126 
Principal diagonal, 55 
Principal minor, 116-118 
bordered, 361-362 
Hawkins-Simon condition and, 304,
305, 306, 314
Product
Cartesian, 16
direct, 16
inner, 54
marginal, 339 
marginal physical, 163, 198, 
340, 499 
marginal revenue, 163 
scalar, 60, 66 
Product limit theorem, 140
Product rule, 155-156, 187
Production functions 
CES, 397-399 
Cobb-Douglas. See Cobb-Douglas 
production function 
linearly homogeneous, 384-386
strictly concave function applied 
to, 341 
strictly quasiconcave function 
applied to, 392 
Profit, maximization of, 235-238 
Profit function, 429-430 
Proper subset, 10 
Pythagoras’ theorem, 65, 512, 635 







Quadratic equation 
quadratic function vs., 35-36
roots of, 36, 38-40, 507-510 
Quadratic forms, 301 
constrained, 358-359 
n-variable, 307 
sign definiteness of characteristic- 
root test, 307-311 
sign definiteness of determinantal 
test, 302-304
three-variable, 305-307 
Quadratic formula, 36-37 
Quadratic function, 21, 22, 27, 35-36
Qualifying arc, 415, 416
Qualitative information, 157, 207 
Quantitative information, 157, 207 
Quantity discount, 131 
Quasiconcave function, 364-371. See 
also Strictly quasiconcave 
function 
CES function as, 398 
criteria for checking, 367-371 
explicitly, 372-373, 378 
in nonlinear programming, 425-426
Quasiconcave programming, 425 
Quasiconvex function, 364-371 
criteria for checking, 367-371 
in nonlinear programming, 426 
Quotient, difference, 125-126 
Quotient limit theorem, 140 
Quotient rule, 158-159, 187


Radian, 514, 515

Radius vector, 60 

Range, 18, 19 

Rank, 85-87, 97-98 

Rate of change, 125 
instantaneous, 126 
proportional, 286n 

Rate of decay, 266 

Rate of growth 
finding, 286-288 
instantaneous, 263-265, 286-288 

Ration constraint, 418-420 

Rational function, 21-23 
continuity of, 142-143
defined, 21 

Rational number, 8 

Razor's edge, 473-474 

Real line, 8 




Real number system, 7-8 
Real roots, 507-509 
distinct, 507-508, 570-571
repeated, 508-509, 571, 579, 583
Reciprocal, 36
Reciprocity conditions, 430-432
Rectangular hyperbola,
21-23, 561, 580
Reduced equation, 477 
Reduced-form solutions, 342-343 
Reduced linearization, 625 
Relation, 16
Relative extremum, 222-223, 291, 347 
determinantal test for, 317 
Taylor series and, 250-253 
Remainder 
Lagrange form of, 248-249
symbol for, 245n
Repeated real roots, 508-509, 571,
579, 583
Replacement investment, 466
Resource, exhaustible, 647-649
Restraint, 348. See also Constraint
Returns to scale
constant. See Constant returns to
scale (CRTS)
decreasing and increasing, 390, 401
Ridge lines, 339
Riemann integral, 457, 459 
Risk, attitudes toward, 231-233 
Roots 
characteristic. See
Characteristic roots
complex, 507-510, 512-513,
572-573, 579
dominant, 574
of polynomial equation, 38-40, 541
of quadratic equation, 36, 38-40,
507-510
real, 507-509, 570-571, 579, 583
Routh theorem, 542-543, 590
Row vector, 50, 53, 55
Roy's identity, 437-438, 440 


Saddle point 

of dynamic system, 618 

of function. 295, 299, 302 

stable and unstable branches of, 618 
Samuelson, P. A., 45n, 542n, 576
Saving function, 185, 465 
Scalar, 52, 59-60 


Scalar multiplication, 52 
Scalar product, 60, 66
Scale effect, 553, 554 
Schur theorem, 598-599 
Second derivative, 227-233 
Second-derivative test, 233 234, 252 
Second-order condition, 298-300,
313-316
derivative vs. differential form of, 
292-293 
necessary vs. sufficient, 234-235,
298, 299, 357-358 
in relation to concavity and 
convexity, 318-331
in relation to quasiconcavity and
quasiconvexity, 364-374
role of, in comparative statics, 345

Second-order total differential, 297-298,
301-302, 356-357
Semilog scale, 287n 
Series. See also Maclaurin series; Taylor
series
convergence of, 249, 261
infinite, 261, 517-519
power, 242
Set(s), 8-14
complement of, 12
denumerable vs.
nondenumerable, 9
disjoint, 11
empty, 10
equality of, 10
finite vs. infinite, 9
intersection of, 11 
laws of operations on, 12-14 
null, 10 
operations on, 11-14 
ordered, 15 
relationships between, 9-11 
subset, 10 
union of, 11 
universal, 12 
Set notation, 9 
Shephard’s lemma, 438-441
Side relation, 348. See also Constraint 
Sign definiteness 
characteristic-root test for, 307-311
determinantal test for, 302-304 
positive and negative, 302 
Silberberg, E., 428n
Simon, H. A., 116n
Simultaneous difference equations
applied, 603-609, 612-613
solving, 594-596


Simultaneous differential equations
applied, 605-607, 610-612, 614
solving, 599-601

Simultaneous-equation approach,

207-209 

Sine function, 514 
derivative of, 517
properties of, 515-517
table of values of, 515, 520 

Singular matrix, 72, 75 

Sinusoidal function, 514 

Slope, 21 

Slutsky equation, 380 

Smith, W. J., 240n

Social-loss function, 69 

Solow, R. M., 44n, 397n, 474, 498 

Solow growth model, 498-502, 652 

Solution, 33-34
boundary vs. interior, 403
economically nonbinding, 420
of inequality, 138-139
mathematically binding, 420
nonconstant, 478
nonnegative, 116-118
nontrivial, 106, 600
outcomes for linear-equation system,
106-107
reduced-form, 342-343 
trivial, 105 
verification of, 478-479 

Square matrix, 50, 88, 96 

State equation, 633-634, 644-645

State variable, 631, 633 

Static analysis 
Leontief input-output models,
112-121
limitations of, 120-121

Statics, 31. See also Comparative statics

Stationary equilibrium, 482 

Stationary point, 224 

Stationary state, 501 

Stationary values, 224, 349-355 

Steady state, 501 

Step function, 131, 552

Stock concept, 264, 466 

Streamlines, 617-618

Strictly concave functions, 318-320 
applied to production functions, 341 
criteria for checking, 320-324 
defined, 230 
strict vs. nonstrict, 318 

Strictly convex functions 
applied to indifference curves,
376-377
applied to isoquants, 341
criteria for checking, 320-324
defined, 230
strict vs. nonstrict, 318

Strictly quasiconcave function, 364-371 
applied to production function, 392 
applied to utility function, 377
Cobb-Douglas function as, 386 
criteria for checking, 367-371 

Strictly quasiconvex function, 364-371 
criteria for checking, 367-371
strict vs. nonstrict, 364

Subset, 10 

Subsidiary condition, 348. See also

Constraint 

Substitutes, 41, 333, 337, 338 

Substitution 
elasticity of, 396, 397 
marginal rate of, 375 
technical, marginal rate of, 199,
391, 396n

Substitution effect, 380-381

Substitution parameter, 397 

Substitution rule, 451-452 

Suen, W., 428n

Sufficiency theorems, 424-428

Sufficient condition, 82-84, 234-235,

357-358, 424, 425 

Sum-difference limit theorem, 140 

Sum-difference rule, 152, 187

Σ notation, 56-58

Sum of squares, 60, 69

Summand, 57

Summation index, 57

Summation sign, 56-58

Supply, 31, 32, 35 
lagged, 555 
with price expectations, 527 

Surface, 25 
concave or convex, 365 
hypersurface, 26 
utility, 377-378 

Symbols 
mathematical, 656-658 
operator, 149 
for remainder, 245n 

Symmetric matrix, 74 





T 


Takayama, A., 118n, 369n 
Tangent function, 514


Taylor series, 242 
convergent, 249 
of functions, 624 
of polynomial functions, 244-245 
relative extremum and, 250-253 
with remainder, 245 
Taylor’s theorem, 245 
Terminal conditions, alternative, 639-644
Terminal line
horizontal, 639
truncated horizontal, 640-643
truncated vertical, 639-640
Terminal point, fixed, 639
Test vector, 415, 416
Time horizon, infinite, 649-653 
Time path 
convergent, 526 
with fluctuation, 525-527, 534-537 
mirror images in, 554 
nonconvergent (divergent), 526 
nonoscillatory and
nonfluctuating, 579
with oscillation, 556-558,
561-562, 565-567
phase-diagram analysis of. See Phase
diagram
of price, 529-532
steady, 481, 583-584
with stepped fluctuation, 574-575,
579, 580, 584
types of, 496-498, 560, 564-566 
Timing, optimal, 282-286 
Total derivatives, 189-194
applied to comparative statics, 
209-210 
partial, 192, 193 
Total differential, 184-187, 352-353
of saving function, 185
second-order, 297-298, 301-302,
356-357 
Total differentiation, 185, 190 
Trajectory, 617 
Transcendental function, 23 
Transformation, 17-18, 593-594
Transitivity, 136 
Transpose, 73-74 
Transversality condition, 634, 637, 
639-640 
Triangular inequality, 65 
Trigonometric function, 23, 514 
Truncated horizontal terminal line,
640-643
Truncated vertical terminal line,
639-640




Tucker, A. W., 402n, 424
Twice continuously differentiable
functions, 154, 227 


U 


Undetermined coefficients, method of, 
538-540, 586-588, 604, 607
Unemployment
inflation and, 532-537, 581-585,
609-614
monetary policy and, 534
natural rate of, 537, 585 
Uniform fluctuation, 526 
Union set, 11 
Unit circle, 523 
Unit vector, 63 
Universal set, 12 
Utility maximization, 374-382 
comparative statics of, 378-382
exhaustible resource and, 647-649
lifetime, 645-647
Utilization, coefficient of, 473


Value(s)

absolute. See Absolute value

critical, 224 

equilibrium, 32 

extreme, 221, 293-301

of function, 18, 19 

of marginal product, 339 

optimum, 221 

present, 266, 468-469, 470 

stationary, 224, 349-355 
Vanishing determinant, 89, 95 
Variable(s), 302 

choice, 221 

continuous vs. discrete, 444 

control, 631 

costate, 633 


defined, 5 
dependent vs. independent, 18
elimination of, 33-34, 111, 116

endogenous vs. exogenous, 5-6
exponents as, 256 
state, 631, 633 

Vector(s)
addition of, 61-62 
characteristic, 307, 308 


column, 50, 53, 55
convex combination of, 328-330
geometric interpretation of, 60-62
as matrices, 50-51
null, 61, 62-63
orthogonal, 309
orthonormal, 310
radius, 60 
row, 50, 53,55 
test, 415, 416 
unit, 63 
zero, 61, 62-63 
Vector difference, 62


Vector space, 63-65 

Venn diagram, 12 

Vertical intercept, 21 

Vertical terminal line, truncated, 
639-640 

Vortex, 619, 627


W


Walras, L., 43, 45n 
Weighted average, 328 
Weighted sum of squares, 69 
Whole numbers, 7 


Y 


Young’s theorem, 296, 431, 432


Zero matrix, 71-72 

Zero-value (vanishing) determinant, 
89, 95 

Zero vector, 61, 62-63

Zeros of a function, 36




